Recent Posts

Pages: [1] 2 3 ... 10
1
Versions - Release History / PDC core version 2114 major release
« Last post by Alexei B. on March 05, 2023, 08:57:13 PM »
Important
- New Search Engine link extraction algorithms to ensure compatibility with the latest changes
- UACE (UniCode Anti-Cheating Engine) will now detect attempts to hide Plagiarism using homoglyphs (such as Cyrillic а in English documents) and other similar cheating attempts.

Many minor changes.

Due to the significance of the SE link extraction update, versions below 2114 will be prompted to update.
2
"Plagiarism Detector" - Bugs\Errors\Crashes / can't connect to PDAS
« Last post by cnbmcmlxix on November 16, 2021, 11:33:43 AM »
After updating PDC to 1991 I can't find the configuration window for PDAS. The "hide PDAS detection window on startup" setting is unchecked. How can PDC connect to PDAS if I can't configure the PDAS address? Any solution?
Thank you.

Edit: I managed to get the configuration window to show up and successfully connected to PDAS, but now I'm getting "PDAS error: search result polling timeout!". Please help me solve this.
Thank you.
3
"Plagiarism Detector Accumulator Server" (PDAS) issues / raw_file_size_exceeded
« Last post by cnbmcmlxix on November 16, 2021, 10:56:21 AM »
raw_file_size_exceeded. This is the error I get when importing some files into the PDAS database. I got this error on 142 docs out of a total of 2089 that I imported. Is there a solution or a workaround for this?
Thank you.
4
"Plagiarism Detector" - Best Practices / Report review and analysis recommendations
« Last post by Alexei B. on September 21, 2021, 07:19:25 PM »
It is important to remember that Plagiarism Detector's reports are to be properly reviewed and assessed by a human before making any final decisions on the level of Plagiarism and/or coincidental similarities present in a document. Not every similarity detected by a program is intentional Plagiarism!

When reviewing a report myself, I usually pay attention to the following:
1. Was the proper preset used? (https://plagiarism-detector.com/smf_bb/index.php/topic,337.0.html)
2. How much Plagiarism is reported? After some practice you will get a feel for the usual level of detection in checked documents. In my practice, a document with 0% detection looks more suspicious than a document with some 5-10% (for example). Due to the nature of any language, it is unlikely for a text to have no similarities to other texts online.
3. How is it distributed within the document? Big solid chunks are more likely to be real Plagiarism than short scattered sections ("6 words here, 7 words there").
4. Whether the detected sections are set word sequences: names of organizations, books, literature or source mentions, etc.
5. Sources reported by the program. If a few sources with a high similarity percentage are reported, I will check them using the side-by-side comparison feature. If there are no "primary" sources, sources with a low overall similarity score are more likely to be coincidental detections.
6. Are References properly used?
7. Are there any problems with the list of possible resources? Things to note include, for example, a very short list of possible resources or a high number of failed resources.


The criteria each client uses may be unique and take into consideration many things, including language, knowledge domain, the Institution's requirements and the type of document, so we don't push any kind of "one rule fits all" solution in this regard. Instead, we recommend taking your time to analyze your reports and using the Side-By-Side Comparison feature to determine whether a reported similarity is intentional Plagiarism or not. After some practice it will be easy to separate reports that require special attention from those you can consider "clean".

You can always contact our support service to get qualified help in reviewing your reports.
5
"Silver Bullets" / TEEC: Text Extraction Engines Config for Resources
« Last post by Mike Sanders on September 13, 2021, 07:38:26 PM »
todo
6
"Silver Bullets" / TEEC: Text Extraction Engines Config for Input Documents
« Last post by Mike Sanders on September 13, 2021, 07:37:15 PM »
todo
7
"Silver Bullets" / Sensitivity settings
« Last post by Mike Sanders on September 13, 2021, 12:15:00 AM »
This setting allows you to change the minimum allowed size for a Plagiarized section, in characters.

This helps when you want to avoid short detections that you consider false-positive results.
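The idea behind this setting can be sketched as a simple post-processing filter. Note that Plagiarism Detector's internals are not public; the match representation and the function below are purely hypothetical illustrations of a minimum-section-size threshold.

```python
# Hypothetical representation of detected matches: (start_offset, matched_text).
# This is only an illustration of the sensitivity threshold idea, not the
# actual Plagiarism Detector implementation.
def filter_short_matches(matches, min_chars=50):
    """Drop detected sections shorter than the minimum allowed size."""
    return [(pos, text) for pos, text in matches if len(text) >= min_chars]

matches = [(0, "x" * 60), (120, "y" * 10)]
# Only the 60-character section survives a 50-character threshold.
kept = filter_short_matches(matches, min_chars=50)
```

Raising the threshold trades recall for precision: very short coincidental overlaps disappear, but genuinely copied short phrases are no longer flagged.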
8
"Silver Bullets" / UACE - UniCode Anti Cheating Engine
« Last post by Mike Sanders on September 13, 2021, 12:03:49 AM »
The UniCode Anti Cheating Engine is aimed at detecting and preventing cheating with homoglyphs. Different alphabets have similar-looking characters that are sometimes used to avoid Plagiarism detection. For example, Latin "a" looks similar to Cyrillic "а", but to a program these are different characters.

UACE detects the use of symbols that are not expected for a given language and will attempt to normalize them to enable the check.

"Warn if % is above" allows you to change the warning threshold. The warning is present both in reports and in the list of reports in the Advanced Reports Viewer.
"Normalize if % is above" changes the normalization threshold. If the percentage of such symbols is above it, the program will replace "foreign" symbols with homoglyphs from the expected alphabet.

An example of a valid case, where "foreign" symbols are actually expected in a document, is citing books or papers in other languages.

The UACE section in a report includes the status of the feature (whether it was on or off), the detection percentage before and after normalization, an automatically generated assessment recommendation, and a full list of all characters present in the document with their Unicode descriptions.
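The detect-then-normalize logic described above can be sketched in a few lines of Python. The homoglyph table and the thresholds below are hypothetical stand-ins (UACE's real mapping and defaults are internal to Plagiarism Detector); the sketch only illustrates the two-step mechanism of measuring the share of unexpected symbols and replacing them when a threshold is exceeded.

```python
import unicodedata

# Hypothetical mapping of common Cyrillic homoglyphs to Latin look-alikes.
HOMOGLYPHS = {
    "а": "a", "е": "e", "о": "o", "р": "p", "с": "c", "х": "x", "у": "y",
}

def foreign_percentage(text, expected_script="LATIN"):
    """Percentage of alphabetic characters outside the expected script."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    foreign = sum(1 for ch in letters
                  if not unicodedata.name(ch, "").startswith(expected_script))
    return 100.0 * foreign / len(letters)

def normalize(text, threshold=5.0):
    """Replace known homoglyphs when foreign symbols exceed the threshold."""
    if foreign_percentage(text) > threshold:
        return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)
    return text
```

For instance, the word "pаper" written with a Cyrillic "а" has a 20% foreign-symbol rate, so with a 5% normalization threshold it would be rewritten as plain Latin "paper" before the check runs.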
9
Over the years since the initial discussion, some additional features have been added to the software that require more data to be uploaded to our server. To keep things transparent, I'll list some important changes here.
The main thing remains the same: no document checked with the Internet check of Plagiarism Detector is stored in any database to run checks against in the future.

Some cases when additional information, the file itself, or your report is uploaded to our servers:
1. Text cannot be extracted from a document on your side (when all local text extractors fail). The file is uploaded to our servers for text extraction and is deleted as soon as the task is completed.
2. Report to PDF export: this feature works server-side, so any report exported to PDF is uploaded to our servers and the resulting PDF is then downloaded from them. The PDF will be available on our servers for a few hours and then deleted. Only the program that uploaded the file has the link to the PDF.
3. Report sharing: if a client decides to share a report online, it will be stored on our servers long-term and available via the link the client shares. These reports can only be accessed via the link provided, so if you share the link, be ready for search engines to index it!
4. We store statistical information on reports, which may include options used, file names and check result percentages (but not the text!). These are only used internally to detect possible problems and otherwise improve the program.
5. We are considering the option of sometimes temporarily, anonymously and securely saving clients' documents or reports for internal development and testing needs only. As of the moment of writing, we don't have such an option. Some features we'd like to add to the program require a lot of real-world material to implement. For example, we need documents with suspected obfuscation cheating to counter it, or documents for the program to learn to differentiate between "various kinds of Plagiarism".

If you have any concerns regarding any of these points - feel free to contact our support service in this regard.
10
FAQ - Frequently Asked Questions / OR Section: Check Type
« Last post by Mike Sanders on October 13, 2020, 09:03:39 PM »
Currently Plagiarism Detector supports the following types of Plagiarism checks:

1. Global Internet Check (includes the SciPap check).
2. Custom Database Check (PDAS): check against your own PDAS Database, which allows you to store very large amounts of documents.
3. Combined Check [Internet + PDAS Database].
4. Check against a Local Folder.
5. Check 2 Documents against each other.
6. Check against our internal scientific papers Database (SciPap).