Plagiarism Detector Community

General Category => "Silver Bullets" => Topic started by: Mike Sanders on September 15, 2013, 04:47:16 PM

Title: Text Extraction Engines Configuration
Post by: Mike Sanders on September 15, 2013, 04:47:16 PM
Plagiarism Detector has several methods of extracting plain text from each supported file type.
(The List of supported file formats (http://www.plagiarism-detector.com/smf_bb/index.php/topic,29.0.html))

Each file type has its own Text Extraction Engines.

A Text Extraction Engine (TEE) is a sub-program that extracts text from a specified file type.

By default, every time you start Document Manager, Plagiarism Detector automatically selects the optimal TEE for each supported file type.

Still, there are 2 cases in which you may need to change the TEE to get better text extraction:
TEE configuration made in Document Manager is global: it is used in all cases.
TEE configuration made in Advanced Report Viewer (ARV) is local: it is used only within the ARV session and is then reset to the global setting.
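
To picture the global vs. local behaviour described above, here is a minimal Python sketch. It is only an illustration and not Plagiarism Detector's actual code: the class TeeConfig, its methods, and the engine names "engine_a" and "engine_b" are all hypothetical, since the real TEE names are not listed in this post.

```python
from contextlib import contextmanager


class TeeConfig:
    """Hypothetical model of TEE settings: global values plus session-local overrides."""

    def __init__(self, defaults):
        # Global settings, as if chosen in Document Manager.
        self._global = dict(defaults)
        # Local overrides, as if chosen inside an ARV session.
        self._local = {}

    def set_global(self, file_type, engine):
        """Change the TEE for a file type everywhere."""
        self._global[file_type] = engine

    def set_local(self, file_type, engine):
        """Change the TEE for a file type for the current session only."""
        self._local[file_type] = engine

    def get(self, file_type):
        """A local override wins; otherwise the global setting is used."""
        return self._local.get(file_type, self._global[file_type])

    @contextmanager
    def arv_session(self):
        """Local overrides are dropped when the session ends."""
        try:
            yield self
        finally:
            self._local.clear()


# Engine names below are placeholders, not real TEE names.
config = TeeConfig({"docx": "engine_a"})

with config.arv_session():
    config.set_local("docx", "engine_b")
    inside = config.get("docx")   # local override is active here

after = config.get("docx")        # back to the global setting
```

In this sketch, a change made with set_global stays in effect afterwards, while a change made with set_local lasts only until the session block exits.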

To change the TEE for a specific file type, click to the right; a combobox appears:
(The example below shows how to change TEE for DocX files)

(https://plagiarism-detector.com/smf_bb/_external_images/doc_selector_change_tee.png)