Topics - Alexei B.

1
Versions - Release History / PDC core version 2114 major release
« on: March 05, 2023, 08:57:13 PM »
Important
- New Search Engine link extraction algorithms to ensure compatibility with latest changes
- UACE (Universal Anti-Cheat Engine) will now detect attempts to hide Plagiarism using homoglyphs (like Cyrillic а in English documents) and other similar cheating attempts.
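
To illustrate the homoglyph trick mentioned above, here is a minimal sketch that flags words mixing letters from different scripts. It is not the UACE algorithm (which is not described here); the function names `script_of` and `mixed_script_words` are made up for this example, and only Latin, Cyrillic and Greek are considered.

```python
import unicodedata


def script_of(ch: str) -> str:
    """Return a coarse script label taken from the Unicode character name."""
    if not ch.isalpha():
        return "OTHER"
    name = unicodedata.name(ch, "")
    for script in ("LATIN", "CYRILLIC", "GREEK"):
        if name.startswith(script):
            return script
    return "OTHER"


def mixed_script_words(text: str) -> list[str]:
    """Return the words that mix letters from more than one script."""
    suspicious = []
    for word in text.split():
        scripts = {script_of(ch) for ch in word} - {"OTHER"}
        if len(scripts) > 1:
            suspicious.append(word)
    return suspicious


# "\u0430" is CYRILLIC SMALL LETTER A, visually identical to Latin "a".
print(mixed_script_words("plagi\u0430rism is hard to spot"))
```

A word written entirely in Cyrillic is not flagged by this sketch; only words that mix scripts are, which is the typical sign of a homoglyph substitution.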

Many minor changes.

Due to the significance of the SE link extraction update, versions below 2114 will be prompted to update.

2
It is important to remember that Plagiarism Detector's reports must be properly reviewed and assessed by a human before any final decision is made on the level of Plagiarism and/or coincidental similarities present in a document. Not every similarity detected by a program is intentional Plagiarism!

When reviewing a report myself, I usually pay attention to the following:
1. Was the proper preset used? (https://plagiarism-detector.com/smf_bb/index.php/topic,337.0.html)
2. How much Plagiarism is reported? After some practice you will get a feeling for the usual level of detection for checked documents. In my practice, a document with 0% detection looks more suspicious than a document with some 5-10% (for example). Due to the nature of any language, it is unlikely that a text has no similarities to other texts online.
3. How is it distributed within the document? Big solid chunks are more likely to be real Plagiarism than short scattered sections ("6 words here, 7 words there").
4. Are the detected sections any kind of set word sequences: names of organizations, books, literature or source mentions, etc.?
5. Sources reported by the program. If a few sources are reported with a high similarity percentage, I will check them using the side-by-side comparison feature. If there are no "primary" sources, sources with a low overall similarity score are more likely to be coincidental detections.
6. Are References properly used?
7. Are there any problems with the list of possible resources? Things to note include, for example, a very short list of possible resources or a high number of failed resources.
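
Point 3 above can be put into numbers. The sketch below is only an illustration of that reviewing habit, not something the program calculates: it assumes you have written down the length (in words) of each detected section, and the 25-word threshold for a "big solid chunk" is an arbitrary assumption.

```python
def long_run_share(run_lengths: list[int], long_run: int = 25) -> float:
    """Share of all matched words that sit in long, solid matched sections.

    run_lengths: length in words of each detected section (hypothetical input).
    long_run:    assumed minimum length of a "big solid chunk".
    """
    total = sum(run_lengths)
    if total == 0:
        return 0.0
    return sum(n for n in run_lengths if n >= long_run) / total


# Many short scattered matches: nothing sits in a long chunk.
print(long_run_share([6, 7, 6, 7]))
# One big chunk plus a short match: most matched words sit in the chunk.
print(long_run_share([120, 6]))
```

A value close to 1 suggests looking at the report more carefully; a value close to 0 points towards coincidental similarities. It does not replace reading the report.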


The criteria each client uses may be unique and take many things into consideration, including language, knowledge domain, the Institution's requirements and the type of document, so we don't push any kind of "one rule fits all" solution in this regard. Instead, we recommend taking your time to analyze your reports and using the Side-By-Side Comparison feature to determine whether a reported similarity is intentional Plagiarism or not. After some practice it will be easy to separate the reports that require special attention from those you can consider "clean".

You can always contact our support service to get qualified help in reviewing your reports.

3
Versions - Release History / PDC core versions 1030-1041
« on: November 22, 2017, 01:17:51 PM »
Important:
- Changed the initial source extraction algorithm. The new algorithm ensures possible sources are found, as the old one has ceased to function.
- Due to the framework change, new versions give a different computer ID. This means licenses have to be re-initialized when updating from an old version.

- Multiple fixes in localizations.
- Multiple fixes regarding compatibility with both old and new versions of Windows.
- Updated compatibility with secure resources to prevent failing of resources using new versions of TLS.
- Other changes.

4
Versions - Release History / PDC core versions 914-915
« on: March 25, 2016, 03:29:00 PM »
- Important: Fixed a problem where the clustering distance was the same for both presets (Arts & Sciences). Results from this version onward will differ from previous versions and should be more exact. Any feedback is appreciated.

- New long-awaited feature added for testing purposes: saving reports to PDF! It will be in testing for some time, and we hope it will then be added on a permanent basis. Reports are uploaded to our servers for conversion, and you then download the result.

5
Versions - Release History / PDC core versions 911-912
« on: March 22, 2016, 11:22:13 PM »
Mostly ARV changes:
- Added button to save the list of reports to CSV
- Added button to save the report to HTML file to a desired location
- Reworked reports so that a notification is shown when an incorrectly re-saved report is opened.
- Slightly changed coloring for reports with lots of failed resources (we plan to separate failures caused by the client computer from those caused by servers)
- New graph in reports: mouseover highlights the top source for a selected section.
- Added a new panel below and changed elements distribution. More elements to come.

6
Versions - Release History / PDC core versions 888-889 - major update
« on: September 24, 2015, 03:43:16 PM »
Plagiarism Detector major update with core version 888:

Features:

01. Added check type selection window to the Step-by-Step Wizard. More details here:  http://www.plagiarism-detector.com/smf_bb/index.php/topic,337.
02. Added failed resources notification. More details here: http://www.plagiarism-detector.com/smf_bb/index.php/topic,48.0.html
03. New T-comparator is now used for comparing documents. It is more precise, yet creates more load on CPU and memory. Due to this change, results from previous versions can differ from current results.
04. Changed license information text on the main screen to avoid misunderstanding

Bugfixes:
1. Fixed an issue, resulting from third-party changes, that made previous versions return "no plagiarism" results for most documents. Previous versions are force-updated because of this fix.

--
We are very much looking forward to your feedback!

7
"Silver Bullets" / OR Section: "Check Type" Word-to-Word vs Re-Write
« on: September 18, 2015, 12:32:09 AM »
Plagiarism Detector team is proud to announce the absolutely awesome feature added to Plagiarism Detector – presets for different kinds of documents are now available “out of the box”!

Some theory behind this:
It has been some time since we first observed cases where our usual Plagiarism detection algorithms provided unsatisfactory results for certain documents. Additional research highlighted some common features of such cases. These are the specific features of texts in the Arts subjects and the Exact Sciences subjects, as we started to call them. While the language of these documents is the same, they usually have some very interesting characteristics, which require a different approach from the Plagiarism detection algorithms.

So, starting with version 885, a user can select the preferred check algorithm in the Step-by-Step Wizard! Please select “detect Text Rewrite (maximum detection)” if your documents are in the Arts subjects or similar, and “detect Word-to-Word (maximum exactness)” if your documents are in the Exact Sciences fields.

We really believe that this new feature (that we have never seen before elsewhere) will help our customers in the ever-ongoing struggle against copy-paste!


8
We are aware of the false-positive Norton alert about the setup-file (at the time of posting, v885).

You are most likely to see WS.Reputation.1 as the alert reason. It looks like Norton reacts to our file because we have released a new version and their anti-virus software is not "familiar" with it yet.

You can read more on the problem here:
http://community.norton.com/forums/clarification-wsreputation1-detection

There should be an option to remove the file from Quarantine, which would allow you to use it.

We are sure the install package for our software does not have any virus inside, as the online-service confirms:
https://www.virustotal.com/en/file/e500af153be86bd6ce305f1440f04f183a38eff8b059b04bcae01937dd59c4eb/analysis/

It is possible that later setup files will also be falsely detected, so clients are encouraged to use VirusTotal for checks.
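
VirusTotal report addresses like the one above are built around the SHA-256 hash of the file, so you can compute the hash of a downloaded setup file locally and search for it on VirusTotal. A minimal sketch; the function name `sha256_of_file` and the file name in the usage comment are only examples:

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hash of a file as a lowercase hex string."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so that large installers do not fill the memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage (the file name is a placeholder for your downloaded setup file):
# print(sha256_of_file("setup.exe"))
```

If the hash of your file matches the one in a clean report, you are looking at the report for exactly the same file.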

We hope it helps.

9
In version 848 there is a problem where only one report is shown in the list of reports, and this report is opened each time any document is checked.

It happens when there are more than 200 reports in the report folder. An easy workaround is to move older reports to a separate folder.
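
If you have a lot of reports, the move can be scripted. This is a minimal sketch under a few assumptions: both folder paths are placeholders (use your own report folder), every file in the report folder is treated as a report, and "older" is judged by the file modification time.

```python
import shutil
from pathlib import Path


def archive_old_reports(report_dir: str, archive_dir: str, keep: int = 200) -> int:
    """Move all but the `keep` newest files from report_dir to archive_dir.

    Returns the number of files moved.
    """
    src = Path(report_dir)
    dst = Path(archive_dir)
    dst.mkdir(parents=True, exist_ok=True)

    # Newest first, so everything after the first `keep` entries is "older".
    files = sorted(
        (p for p in src.iterdir() if p.is_file()),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    moved = 0
    for old in files[keep:]:
        shutil.move(str(old), str(dst / old.name))
        moved += 1
    return moved


# Usage (both paths are placeholders):
# archive_old_reports(r"C:\Reports", r"C:\Reports_archive", keep=150)
```

Keeping the archive folder outside the report folder is the safer choice, and it is best to close the program before moving the files.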

This will be fixed in the next version released.

10
We have received several requests regarding references in the footnotes of the docX documents not being detected.

The problem is this: text from footnotes is not extracted. It is easy to check: none of the links present in the footnotes appear in the report. We can do nothing here, since even Microsoft iFilters don't do this.

We recommend adding all the links from the footnotes at the end of the document as text, for example as a list of used resources. In this case it will work just fine (remember to have URL references enabled in the program).
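
For long documents, the links can be collected from the footnotes automatically and then pasted at the end of the document. The sketch below is not part of Plagiarism Detector. It relies on the fact that a docX file is a ZIP archive in which footnotes are stored in `word/footnotes.xml` and their hyperlink targets in `word/_rels/footnotes.xml.rels`; the function name `footnote_urls` and the simple URL pattern are assumptions made for this example.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"
REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"
URL_RE = re.compile(r"https?://[^\s<>\"]+")


def footnote_urls(docx_path: str) -> list[str]:
    """Collect URLs found in the footnotes of a .docx file."""
    urls: list[str] = []
    with zipfile.ZipFile(docx_path) as archive:
        names = set(archive.namelist())

        # URLs typed as plain text inside footnotes.
        if "word/footnotes.xml" in names:
            root = ET.fromstring(archive.read("word/footnotes.xml"))
            for paragraph in root.iter(W_NS + "p"):
                # Word may split one URL across several runs, so join them.
                text = "".join(t.text or "" for t in paragraph.iter(W_NS + "t"))
                urls.extend(u.rstrip(".,;)") for u in URL_RE.findall(text))

        # Targets of clickable hyperlinks used in footnotes.
        rels_name = "word/_rels/footnotes.xml.rels"
        if rels_name in names:
            rels = ET.fromstring(archive.read(rels_name))
            for rel in rels.iter(REL_NS + "Relationship"):
                target = rel.get("Target")
                is_link = rel.get("Type", "").endswith("/hyperlink")
                if is_link and rel.get("TargetMode") == "External" and target:
                    urls.append(target)

    # Remove duplicates while keeping the original order.
    return list(dict.fromkeys(urls))


# Usage (the file name is a placeholder):
# print("\n".join(footnote_urls("thesis.docx")))
```

The printed list can be pasted at the end of the document as a list of used resources.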

I hope it helps.

11
Versions - Release History / PDC core version 838 - minor update
« on: March 22, 2014, 02:34:39 PM »
- Fixed an error with a document loading failure in Demo version
- Compare a document to a folder of documents now compares with all the documents in a folder and all sub-folders
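
For those who prepare folders of documents with scripts: "all the documents in a folder and all sub-folders" corresponds to a recursive listing like the one sketched below. This only illustrates the idea and is not the program's code; the function name `documents_in_tree` is made up, and no filtering by file type is done.

```python
from pathlib import Path


def documents_in_tree(folder: str) -> list[Path]:
    """Return every file in the folder and in all of its sub-folders."""
    return sorted(p for p in Path(folder).rglob("*") if p.is_file())


# Usage (the path is a placeholder):
# for doc in documents_in_tree(r"C:\Documents\ToCompare"):
#     print(doc)
```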

12
Over the time of our work in the field of Plagiarism Detection we have answered a lot of questions like "Are our documents stored elsewhere?". The answer is always the same: Your documents don't leave your computer during Plagiarism Detector checks.

But not so long ago a user started a more general discussion about the safety of such products. Safety for your documents, that is.

With permission from that kind person, I will publish our conversation here (with minor edits). It may be useful for those worried about the problem.

_____________________

User:

I have a question regarding this software..

If I make a test with my documents to test against plagiarism? Why my documents and my scientific works must remain on the server online
stored? This software must be 100 % confidential. No need to upload my scientific works on the server and stay there permanently.

PD Team:

No checked documents leave your computer when searching for plagiarism (except for fragments used for a search itself), nor do we have any database at our side with users' documents. Thus your documents are confidential.

You may have misunderstood the PDAS software description. PDAS is software that a client can use to store documents in his own database and check against them. It is not used for the Plagiarism Detector Internet check.

User:

That's why I asked because has hovered the suspicion that these softwares for against plagiarism take us the documents which we scan with these softwares against plagiarism and you receive and publish them before us and upload them on the internet.. It is not good with our work.
On many websites people complains regarding this thing. And many do not want to scan for against plagiarism for this reason, because they worked in vain and software developers receive all scanned documents and they publish them or upload on the websites.
It is intellectual theft.

PD Team:

We didn't do any research on the way our competitors work, but we can assure you that no documents leave your computer during check with Plagiarism Detector.

The thing you are worried about would seem rather counter-productive for any service that takes care of its clients. Thus the first risk factor I would name is "free of charge", meaning that I wouldn't trust any plagiarism-check service that is free to all. The question stands: "what's their interest then?".

But for any serious and well-established service, the revenue lost from unsatisfied customers leaving (and creating bad PR) would matter more than their documents. At least I believe so.

Once again: with our software no documents leave your computer to be stored elsewhere.

As you have raised a serious problem, I kindly ask your permission to publish this conversation (text only, no names) on our forums, as it may be of interest for other people.

User:

I raised this issue because, as I said, has hovered the suspicion that these kind of softwares of against plagiarism steal our PC documents, scientific works etc, when we add in the software the document and check and scan the document.
And now, I don't just mean just at this software Plagiarism Detector. Generally this kind of software that checks if the document is plagiarized or not, people worry, that their work would be compromised in vain due of intellectual theft, ie, the theft of the software..
Yes, you can use these phrases and put on the application's forum, but without my name or my e-mail. Thank you in advance!

PD Team:

Well, I totally understand your concerns. So let me analyze the risks from my experience.

In addition to the already mentioned "free cheese can be a mouse-trap":

1. Software installed on your computer is more secure than an Internet site providing the check service. Everything that follows applies to installed software, since no one knows what happens server-side.

2. With certain skills, you can check what data leaves your computer during the check. I can't go into the details of the algorithms we use, but to my knowledge one cannot reproduce the document from the search requests generated by our software. Besides, any traffic analysis will show the requests going to different places (since several search engines are used), thus separating the fragments even more. If such an analysis shows a whole document uploaded somewhere in one piece, it doesn't look secure.

3. An easier way: if the software is observed to load the CPU heavily, it is working on your side. If the CPU is not loaded, the software is either not so good, or the document is uploaded to some server and checked there. Evidently, that is less secure.

4. Additional data. You can always search the Internet for third-party sites mentioning the software you are interested in. The more "serious business" it looks, the more secure it is likely to be. As I have said before, one is unlikely to risk the profits of a well-established business by stealing clients' intellectual property. But if the site looks like the homework of a post-graduate student and no one has ever heard of the product before, well, use it at your own risk.

Besides, I have just consulted our RnD about this and they did provide some additions from their perspective:

1. It is rather unlikely that clients' works are stolen for their scientific value. More likely it happens as part of the regular process of filling the database that documents are checked against. For a widely used service it looks impossible to analyze all the incoming documents for scientific value.

2. Indeed, some services do store all the documents that are checked with them (we don't think it right to name them). You can follow the above-mentioned list of criteria to reduce the chance of using such a service. Even if the service stores clients' documents, there are two options: documents are stored for the service's internal use only (later documents are checked against old ones), or documents are later indexed by search engines, which is indeed a serious threat, as your document becomes publicly available.

3. Someone interested in detailed research in the field can make a set of "trap documents" that are 100% original and check them with different services, each document with one service only. Then, in about 1.5 months, repeat the check with all the services. If a document remains 100% original with the same service, no documents are stored. If it is found plagiarized by the same service, but by none of the others, documents are stored in the internal database. But if different services start finding plagiarism in a clean document that was checked with one service only, those services are just mirrors of a single document-storing server, or the document has become publicly available.
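
The "trap-document" reasoning in point 3 can be written down as a small decision rule. It is only a sketch of the logic described above: the function name and the service names are hypothetical, and a real experiment needs several documents and some patience.

```python
def interpret_recheck(first_checked_with: str, flagged_by: set[str]) -> str:
    """Interpret the second check of a 100% original trap document.

    first_checked_with: the only service the document was first checked with.
    flagged_by:         services that report plagiarism on the second check.
    """
    if not flagged_by:
        return "no sign that the document was stored"
    if flagged_by == {first_checked_with}:
        return "stored in the internal database of that service"
    # Services that never saw the document now find it somewhere.
    return "shared document-storing server, or the document became public"


print(interpret_recheck("ServiceA", set()))
print(interpret_recheck("ServiceA", {"ServiceA"}))
print(interpret_recheck("ServiceA", {"ServiceA", "ServiceB"}))
```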

We hope you find this information useful.

13
FAQ - Frequently Asked Questions / OR Section: Excluded URLs List
« on: January 13, 2014, 03:44:17 AM »
On the last page of the Step-By-Step Wizard you can open the second tab, Exclusion lists. It allows you to edit two different lists that give the search engine specific details.

Exclude list: if any exclusion mask in this list is present in the found page URL - the page is ignored during check.
For example, adding the title of the document to the list will make the check ignore all Internet sources that use the document name in the page address.
Example:
If the exclusion list contains "wikipedia" then ALL URLs to wikipedia.org will be ignored.
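
In other words, the rule is a plain "mask is contained in the URL" test. A minimal sketch of that rule; the function name `is_excluded` is made up, and ignoring upper/lower case is an assumption of this example rather than documented behaviour:

```python
def is_excluded(url: str, exclusion_masks: list[str]) -> bool:
    """Return True if any non-empty exclusion mask occurs in the URL."""
    url_lower = url.lower()
    # Assumption: the comparison ignores case.
    return any(mask.lower() in url_lower for mask in exclusion_masks if mask)


masks = ["wikipedia"]
print(is_excluded("https://en.wikipedia.org/wiki/Plagiarism", masks))
print(is_excluded("https://example.com/essay.html", masks))
```

Note that a short mask can match more than intended, since it is looked for anywhere in the address, not only in the site name.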

14
+ "Detect URL references" now works fine. If a document contains a section with a direct link to the Internet page containing it - the section is marked as Referenced (if the check-box is checked)
+ Exclude list now works as intended: if any word in this list is present in the found page URL - the page is ignored during check
+ Include list now works as intended: any URL in the list is thoroughly checked against during the search.
+ "Check against folder of documents" will now always ignore the file that is checked, if it is contained in the same folder.
+ Improved PDAS compatibility.

Planned for the next minor release: faster loading of Available Reports list.

15
FAQ - Frequently Asked Questions / Generation Time and Date
« on: November 13, 2013, 01:52:28 PM »
This section shows the time and date of the document check.
