Plagiarism Detector Community

General Category => "Silver Bullets" => Topic started by: Mike Sanders on September 15, 2013, 04:47:16 PM

Title: Text Extraction Engines Configuration
Post by: Mike Sanders on September 15, 2013, 04:47:16 PM
Plagiarism Detector has several methods of extracting plain text from each supported file type.
(The list of supported file formats (,29.0.html))

Each file type has its own Text Extraction Engines.

A Text Extraction Engine (TEE) is a sub-program that extracts text from a specified file type.

By default, every time you start Document Manager, Plagiarism Detector automatically selects the optimal TEE for each supported file type.

Still, there are 2 cases in which you may need to change the TEE to get better text extraction:
A TEE configuration made in Document Manager is global: it is used in all cases.
A TEE configuration made in Advanced Report Viewer (ARV) is local: it is used only within the ARV session and is then reset to the global setting.
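
The difference between the global and the local setting can be pictured with a small Python sketch. This is only an illustration of the behaviour described above, not Plagiarism Detector's actual code or API: the class ArvSession, its methods, and the engine names are all hypothetical.

```python
from collections import ChainMap

# Global settings, as they would be chosen in Document Manager.
# The engine names are made up for this example.
global_tee = {"docx": "engine_a", "pdf": "engine_b"}


class ArvSession:
    """Hypothetical ARV session: local TEE choices layered over the global ones."""

    def __init__(self, global_settings):
        self._local = {}
        # Lookups check the local overrides first, then the global settings.
        self._view = ChainMap(self._local, global_settings)

    def set_tee(self, file_type, engine):
        # Only the session-local layer is changed; the global dict is untouched.
        self._local[file_type] = engine

    def tee_for(self, file_type):
        return self._view[file_type]

    def close(self):
        # Ending the session drops the overrides, so the global setting applies again.
        self._local.clear()


session = ArvSession(global_tee)
session.set_tee("docx", "engine_c")   # local change inside the session
in_session = session.tee_for("docx")  # the local override is used here
session.close()
after_session = session.tee_for("docx")  # back to the global setting
```

In this sketch a change made through the session never touches global_tee, which mirrors why a TEE picked in ARV does not carry over to later checks.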

To change the TEE for a specific file type, click to the right and a combobox appears:
(The example below shows how to change the TEE for DocX files)