
Recent posts

#1
Versions - Release History / PDC core version 2867
Last post by Alexei B. - February 06, 2025, 12:48:06 PM
- Caches reworked
- Added automated cache clearing options
- Fixed a problem with checks failing on "read only" files.

Minor fixes.
#2
Versions - Release History / PDC core version 2848
Last post by Alexei B. - January 29, 2025, 02:02:46 AM
Support for old Windows 7/8 (64-bit) restored.
WebView runtime will be installed automatically if not present.

Minor fixes.
#3
Versions - Release History / PDC core 2828 important hotfix
Last post by Alexei B. - January 26, 2025, 01:24:01 PM
Includes an important fix for one search engine not working.

Known problem: will not work on Win7/8 without the necessary runtime present.
#4
Versions - Release History / PDC core version 2114 major re...
Last post by Alexei B. - March 05, 2023, 06:57:13 PM
Important
- New Search Engine link extraction algorithms to ensure compatibility with latest changes
- UACE (Universal Anti-Cheat Engine) will now detect attempts to hide Plagiarism using homoglyphs (like the Cyrillic а in English documents) and other such cheating attempts.

Many minor changes.

Due to the significance of the SE link extraction update, versions below 2114 will be prompted to update.
#5
It is important to remember that Plagiarism Detector's reports are to be properly reviewed and assessed by a human before making any final decisions on the level of Plagiarism and/or coincidental similarities present in a document. Not every similarity detected by a program is intentional Plagiarism!

When reviewing a report myself, I usually pay attention to the following:
1. Was the proper preset used? (https://plagiarism-detector.com/smf_bb/index.php/topic,337.0.html)
2. How much Plagiarism is reported? After some practice you will get a feel for the usual level of detection for checked documents. In my practice, a document with 0% detection looks more suspicious than one with some 5-10%, for example. Due to the nature of any language, it is unlikely for a text to have no similarities to other texts online.
3. How is it distributed within the document? Big solid chunks are more likely to be real Plagiarism than short scattered sections ("6 words here, 7 words there").
4. Whether the detected sections are set word sequences: names of organizations, books, literature or source mentions, etc.
5. Sources reported by the program. If there are a few sources with a high similarity percentage reported, I will check them using the side-by-side comparison feature. If there are no "primary" sources, sources with a low overall similarity score are more likely to be coincidental detections.
6. Are References properly used?
7. Are there any problems with the possible resources listed? Things to note include, for example, a very short list of possible resources or a high number of failed resources.
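The heuristics in points 2 and 3 can be sketched as a toy calculation. The spans and numbers below are hypothetical inputs, not the format of an actual Plagiarism Detector report:

```python
# Toy heuristics over detected sections, given as (start, length) spans
# in characters. These inputs are illustrative, not the real report format.

def detection_stats(doc_length, sections):
    """Return overall detection % and the longest contiguous match."""
    matched = sum(length for _, length in sections)
    percent = 100.0 * matched / doc_length if doc_length else 0.0
    longest = max((length for _, length in sections), default=0)
    return round(percent, 1), longest

# A document with a few short scattered matches ("6 words here, 7 there")
scattered = [(120, 35), (900, 40), (2400, 30)]
# A document with one big solid chunk -- more likely real Plagiarism
solid = [(500, 1800)]

print(detection_stats(20000, scattered))  # (0.5, 40) -- low %, short matches
print(detection_stats(20000, solid))      # (9.0, 1800) -- one long chunk
```

A human reviewer would of course weigh these numbers against the source list and the side-by-side comparison, as described above.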


The criteria each client uses may be unique and take into consideration many things, including language, knowledge domain, the Institution's requirements and the type of document, so we don't push any kind of "one rule fits all" solution in this regard. Instead we recommend taking your time to analyze your reports and using the Side-by-Side Comparison feature to determine whether a reported similarity is intentional Plagiarism or not. After some practice it will be easy to separate reports that require special attention from those you can consider "clean".

You can always contact our support service to get qualified help in reviewing your reports.
#6
"Silver Bullets" / TEEC: Text Extraction Engines ...
Last post by Mike Sanders - September 13, 2021, 04:38:26 PM
todo
#7
"Silver Bullets" / TEEC: Text Extraction Engines ...
Last post by Mike Sanders - September 13, 2021, 04:37:15 PM
todo
#8
"Silver Bullets" / Sensitivity settings
Last post by Mike Sanders - September 12, 2021, 09:15:00 PM
This setting allows you to change the minimum allowed size for a Plagiarized section, in characters.

This helps when you want to avoid short detections that you consider false positives.
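As an illustration only (the function name and data are made up, not the program's internals), such a minimum-size filter amounts to discarding detected sections below a character threshold:

```python
# Sketch of a minimum-section-size filter: detections shorter than the
# threshold (in characters) are discarded as likely false positives.
# The sections and threshold here are illustrative, not the real engine.

def filter_sections(sections, min_chars):
    """Keep only detected sections of at least min_chars characters."""
    return [s for s in sections if len(s) >= min_chars]

detections = [
    "lorem ipsum dolor",                                       # 17 chars
    "a much longer matched passage that spans several words",  # 54 chars
]
print(filter_sections(detections, 30))  # only the longer passage survives
```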
#9
"Silver Bullets" / UACE - UniCode Anti Cheating E...
Last post by Mike Sanders - September 12, 2021, 09:03:49 PM
UniCode Anti Cheating Engine is aimed at detecting/preventing cheating with homoglyphs. Different alphabets have similar-looking characters that are sometimes used to avoid Plagiarism detection. For example, Latin "a" looks similar to Cyrillic "а", but a program considers them different letters.

UACE detects the use of symbols that are not expected for a given language and will attempt to normalize them so the check can proceed.

"Warn if % is above" allows you to change the warning threshold. The warning is present both in reports and in the list of reports in the Advanced Reports Viewer.
"Normalize if % is above" changes the normalization threshold. If the percentage of such symbols is above it, the program will replace "foreign" symbols with homoglyphs from the expected alphabet.

An example of a valid case, where "foreign" symbols are actually expected in a document, is citing books or papers in other languages.

The UACE section in a report includes the status of the feature (whether it was on or off), the detection percentage before and after normalization, an automatically generated assessment recommendation, and the full list of all characters present in the document with their Unicode descriptions.
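The idea behind homoglyph detection and normalization can be sketched in a few lines. The mapping and threshold below are illustrative assumptions, not UACE's actual implementation:

```python
# Sketch of homoglyph detection and threshold-based normalization.
# The mapping covers only a few Cyrillic letters; the real UACE engine
# works differently internally -- this is only an illustration.
import unicodedata

CYRILLIC_TO_LATIN = {"а": "a", "е": "e", "о": "o", "с": "c", "р": "p"}

def foreign_percent(text):
    """Percentage of characters that are known Cyrillic homoglyphs."""
    hits = sum(1 for ch in text if ch in CYRILLIC_TO_LATIN)
    return 100.0 * hits / len(text) if text else 0.0

def normalize(text, threshold=5.0):
    """Replace homoglyphs with Latin letters if their share exceeds threshold."""
    if foreign_percent(text) <= threshold:
        return text
    return "".join(CYRILLIC_TO_LATIN.get(ch, ch) for ch in text)

cheated = "plаgiаrism"  # the two "а" letters are Cyrillic U+0430
print(unicodedata.name(cheated[2]))      # CYRILLIC SMALL LETTER A
print(normalize(cheated, threshold=5.0)) # -> "plagiarism"
```

A legitimate foreign-language citation would stay below the threshold and be left untouched, matching the valid case described above.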
#10
General Talks about Plagiarism / Re: Your Intellectual Property...
Last post by Alexei B. - September 06, 2021, 12:04:57 PM
Over the years since the initial discussion, some additional features have been added to the software that require more data to be uploaded to our server. To keep things transparent, I'll list the important changes here.
The main thing remains the same: no document checked with the Internet check of Plagiarism Detector is stored in any database to run future checks against.

Here are the cases in which additional information, the file itself, or your report is uploaded to our servers:
1. Text cannot be extracted from a document on your side (in case all local text extractors failed). The file is uploaded to our servers for text extraction and is deleted as soon as the task is completed.
2. Report to PDF export - this feature works server-side, so any report exported to PDF is uploaded to our servers and the resulting PDF is then downloaded from them. The PDF remains available on our servers for a few hours and is then deleted. Only the program that uploaded the file has the link to the PDF.
3. Report-sharing - if a client decides to share a report online, it will be stored on our servers for a long time and be available via the link the client shares. These reports are only accessible via the link provided, so if you share the link, be ready for search engines to index it!
4. We store statistical information on reports, which may include options used, file names and check result percentages (but not the text!). These are only used internally to detect possible problems and otherwise improve the program.
5. We are considering an option to sometimes temporarily, anonymously and securely save clients' documents or reports for internal development or testing needs only. As of the moment of writing we do not have such an option. Some features we'd like to add to the program require a lot of real-world material to implement. For example, we need documents with suspected obfuscation cheating in order to counter it, or documents for the program to learn to differentiate between "various kinds of Plagiarism".

If you have any concerns regarding any of these points - feel free to contact our support service in this regard.