Messages - Alexei B.

16
Versions - Release History / PDC core versions 914-915
« on: March 25, 2016, 03:29:00 PM »
- Important: Fixed a problem where the clustering distance was the same for both presets (Arts & Sciences). Results from this version onward will differ from those of previous versions and should be more exact. Any feedback is appreciated.

- New long-awaited feature added for testing purposes: saving reports to PDF! It will remain in testing for some time, and we hope to make it a regular feature after that. Reports are uploaded to our servers for conversion, and you then download the result.

17
Versions - Release History / PDC core versions 911-912
« on: March 22, 2016, 11:22:13 PM »
Mostly ARV changes:
- Added button to save the list of reports to CSV
- Added button to save the report to HTML file to a desired location
- Re-worked reports so that a notification is shown when an incorrectly re-saved report is opened.
- Slightly changed coloring for reports with many failed resources (we plan to separate failures caused by the client computer from those generated by servers)
- New graph in reports; mouseover highlights the top source for a selected section.
- Added a new panel below and changed the layout of elements. More elements to come.

18
"Silver Bullets" / Re: Resources Error Codes
« on: March 18, 2016, 03:55:42 PM »
New information added to the report for failed resources:
1. Browser error message now displayed.
RDF
Resource Download Failed. Means that the resource was not downloaded and thus was not analyzed.

GRF
Get Response Failed. Means that the resource was not downloaded because the site did not respond to the request.

19
"Plagiarism Detector" - Bugs\Errors\Crashes / Re: Problem with V895
« on: December 02, 2015, 04:53:57 AM »
Quote
Another recommendation would be to unlock the ability for a customer to use the previous version of the software

This would have been done if it were possible. Forcing an update to the latest version was not our wish but a necessity. Due to some third-party changes, versions 850 and below started providing totally inadequate results (very poor detection), so we had to force-update our clients.

T-comparator presets are not so easily changed. The presets currently in use are based on algorithms developed for the plagiarism detection task of the PAN'14 conference, and they showed some of the best results on the document corpus used there. One line of our current research is how that corpus differs from "real-life" cases, which may be closely connected to the problem you report.

Changing the preset in the current system requires some research, as we need to be sure the new preset works better. We have now started additional tests on false-positive detection; depending on the results, we will post here and may start making the needed alterations.

20
"Plagiarism Detector" - Bugs\Errors\Crashes / Re: Problem with V895
« on: November 21, 2015, 11:25:49 PM »
As you were informed, we will consider changing the detection threshold for the Arts preset. However, as of now we don't see enough reason to change it: this preset is aimed at detecting obfuscated plagiarism, and our tests showed that it deals with this problem just fine.

We need enough evidence of false positives with this preset to change it. We are ready to analyze each such case, and we already have some of them stored. But the number of such cases is not yet sufficient to start making changes.

We recommend using the word-to-word preset when it is necessary to avoid false positives.

21
"Plagiarism Detector" general discussion / Re: license Difference
« on: November 04, 2015, 01:03:36 PM »
Thank you for your interest in our software!

Lite is the most basic license. We are now reviewing the differences between the Lite and Personal licenses, so these will be different in the next version.

Pro - this license gives access to all features and also provides licenses for two computers.

Portable - this gives an additional license to be used from your USB/Flash drive.

All prices are available on our site: http://plagiarism-detector.com/plagiarism-detector-buy-now.php

Feel free to ask any additional questions.

22
Versions - Release History / PDC core versions 888-889 - major update
« on: September 24, 2015, 03:43:16 PM »
Plagiarism Detector major update with core version 888:

Features:

01. Added check type selection window to the Step-by-Step Wizard. More details here:  http://www.plagiarism-detector.com/smf_bb/index.php/topic,337.
02. Added failed resources notification. More details here: http://www.plagiarism-detector.com/smf_bb/index.php/topic,48.0.html
03. New T-comparator is now used for comparing documents. It is more precise, yet creates more load on CPU and memory. Due to this change, results from previous versions can differ from current results.
04. Changed the license information text on the main screen to avoid misunderstanding

Bugfixes:
1. Fixed an issue, resulting from third-party changes, that made previous versions return "no plagiarism" results for most documents. Previous versions are force-updated because of this fix.

--
We are looking forward to your feedback!

23
"Silver Bullets" / Re: Check Type: Word-to-Word VS Re-Written
« on: September 18, 2015, 05:03:31 PM »
The Exact Sciences.

Texts in these fields of knowledge showed certain features of their own, which make the above-mentioned obfuscated Plagiarism detection algorithms unacceptable in many cases. Texts in Physics, Maths, etc. are usually much less flexible and make heavy use of domain-specific constructions and expressions that also appear in many other texts from the same domain. One of the best examples was a certain medical prescription, which was reported as Plagiarized upon checking. However, a manual check did not confirm it. It turned out that most (if not all) prescriptions use the same text structure as well as the same words and expressions. Only the components change.

Let us take this example from Wikipedia:
“Take of pentobarbitone sodium, three grammes
of sulphate of morphia, two grammes
of hydrate of chloral, fifteen grammes
of table sugar, enough to make fifty grammes.”
And now let’s shuffle the ingredients randomly:
“Take of hydrate of chloral, three grammes
of pentobarbitone sodium, two grammes
of sulphate of morphia, fifty grammes
of table sugar, enough to make fifteen grammes.”

And now remember the example from the Arts section that was to be detected as Plagiarism. It is rather evident that, because the same language is used, these two passages will be considered the “same” text that was obfuscated by changing the word order (one of the approaches to obfuscation).

Of course this is an error, one that we call a false positive. Errors of this kind are common to all Plagiarism detection algorithms that aim to detect obfuscated Plagiarism.

With this in mind, we modified the algorithm specifically for such texts, so that it detects only what we call “word-to-word” Plagiarism. This algorithm will correctly treat the two “prescriptions” as different texts, but it will also treat the pair from the “Arts” example as different texts.

So this “word-to-word” Plagiarism Detection algorithm has the following features:
-   Detects only similar parts of texts
-   Prevents false-positive results
-   Usually shows less Plagiarism than a regular algorithm
-   Is bad at finding even slightly obfuscated Plagiarism
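
To make the trade-off concrete, here is a minimal Python sketch using the prescription example above. It is only an illustration and is not the algorithm used in Plagiarism Detector: the function names `bag_similarity` and `longest_shared_run` are hypothetical, and the two measures (word-set overlap and longest identical word run) are simple stand-ins for an order-insensitive check and a word-to-word check.

```python
import re
from difflib import SequenceMatcher


def words(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def bag_similarity(a, b):
    """Order-insensitive overlap: Jaccard index of the two word sets."""
    wa, wb = set(words(a)), set(words(b))
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def longest_shared_run(a, b):
    """Length (in words) of the longest run of identical consecutive words."""
    wa, wb = words(a), words(b)
    matcher = SequenceMatcher(None, wa, wb, autojunk=False)
    return matcher.find_longest_match(0, len(wa), 0, len(wb)).size


original = (
    "Take of pentobarbitone sodium, three grammes "
    "of sulphate of morphia, two grammes "
    "of hydrate of chloral, fifteen grammes "
    "of table sugar, enough to make fifty grammes."
)
shuffled = (
    "Take of hydrate of chloral, three grammes "
    "of pentobarbitone sodium, two grammes "
    "of sulphate of morphia, fifty grammes "
    "of table sugar, enough to make fifteen grammes."
)

if __name__ == "__main__":
    print("word-set overlap:", bag_similarity(original, shuffled))
    print("longest identical run:", longest_shared_run(original, shuffled),
          "of", len(words(original)), "words")
```

The order-insensitive measure sees the same vocabulary in both prescriptions and rates them as identical, which is the false positive described above. The word-run measure finds only short identical stretches, so it keeps the two prescriptions apart, but for the same reason it would also miss a lightly reworded copy.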

In recent years we have had several versions of Plagiarism Detector using this algorithm, and they were provided to customers who required this kind of check. However, maintaining two very different versions is not what we consider right, so our RnD spent much time on incorporating both algorithms into a single piece of software!


24
"Silver Bullets" / Re: Check Type: Word-to-Word VS Re-Written
« on: September 18, 2015, 05:02:28 PM »
The Arts.

Texts in these subjects are very flexible in nature and allow much modification without actually changing the meaning. Any analysis of a piece of literature is a good example of this. To detect Plagiarism in the best possible way, software has to detect obfuscated, “re-written” cases of Plagiarism, where sentences are modified (manually or automatically) to keep the meaning but avoid detection. We have multiple cases of such modified documents, provided by our customers at different times, which show how hard some students try to avoid Plagiarism detection. For example, the sentence “It was a need for him to have the computer fixed” should be detected as similar to “he must have had the PC fixed”. Please note: these examples are hypothetical and very simplified. The algorithm is much more complex, and this pair may or may not be detected, depending on the context.

This approach to Plagiarism detection is perceived to be better not only by us but also by many competitors, and that is due to several advantages:
-   Obfuscated Plagiarism detection
-   More Plagiarism detected – users often compare software by the detection percent for the same document

That is why it has usually been a default setting for our software.
However, this approach was found to have a significant drawback:
-   False-positive results for certain documents (see below)

25
"Silver Bullets" / Check Type: Word-to-Word VS Re-Written
« on: September 18, 2015, 12:32:09 AM »
The Plagiarism Detector team is proud to announce an absolutely awesome feature added to Plagiarism Detector: presets for different kinds of documents are now available “out of the box”!

Some theory behind this:
It has been some time since we first observed cases where our usual Plagiarism detection algorithms provided unsatisfactory results for certain documents. Additional research highlighted some common features of such cases. These are the specific features of texts in the Arts subjects and the Exact Sciences subjects, as we started to call them. While the language of these documents is the same, they usually have some very interesting characteristics that require a different approach from the Plagiarism detection algorithms.

So, starting with version 885, a user can select the preferred check algorithm in the Step-by-Step Wizard! Please select “detect Text Rewrite (maximum detection)” if your documents are in the Arts subjects or similar, and “detect Word-to-Word (maximum exactness)” if your documents are in the Exact Sciences fields.

We really believe that this new feature (which we have never seen elsewhere) will help our customers in the ever-ongoing struggle against copy-paste!


26
We are aware of the false-positive Norton alert about the setup-file (at the time of posting, v885).

You are most likely to see WS.Reputation.1 as the alert reason. It looks like Norton reacts to our file because we have released a new version and their anti-virus software is not yet "familiar" with it.

You can read more on the problem here:
http://community.norton.com/forums/clarification-wsreputation1-detection

There should be an option to remove the file from Quarantine, which would allow you to use it.

We are sure the install package for our software does not contain any virus, as the VirusTotal online service confirms:
https://www.virustotal.com/en/file/e500af153be86bd6ce305f1440f04f183a38eff8b059b04bcae01937dd59c4eb/analysis/

It is possible that later setup files will also be falsely detected, so clients are encouraged to use VirusTotal for checks.

We hope it helps.
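
If you want to confirm that the setup file you downloaded is the same file as the one in the VirusTotal report, you can compare hashes. The sketch below assumes that the 64-character hexadecimal string in the report link above is the file's SHA-256 hash (the usual form of VirusTotal file identifiers); the file name `setup.exe` is only a placeholder for wherever you saved the installer.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Hash taken from the VirusTotal link in the post above.
REPORTED_HASH = "e500af153be86bd6ce305f1440f04f183a38eff8b059b04bcae01937dd59c4eb"

if __name__ == "__main__":
    import sys

    # "setup.exe" is a placeholder; pass the real path as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "setup.exe"
    actual = sha256_of_file(path)
    print("match" if actual == REPORTED_HASH else "different file: " + actual)
```

A mismatch does not by itself mean the file is dangerous; it only means the report at that link describes a different build (for example, a later version), which you can then upload to VirusTotal separately.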

27
Just discussed the situation with RnD.

1. Some kind of detailed setting for the T-Comparator is planned for the future.

2. Please send a pair of reports (old and new), with a detailed explanation of what is wrong, to our support e-mail (check your forum PM). Our RnD asks that the problem be described clearly enough for us to be sure we are looking at the same thing you are writing about. Screenshots can help a lot.

Any user-provided materials are used for internal testing only.

28
The latest version uses a different detection algorithm, in which this setting is no longer available: changing the detection mode now requires altering more than one parameter. This algorithm is much better in general, so we moved to it.

In this version we keep two possible settings: one for re-written and one for word-to-word Plagiarism. I will soon post a detailed explanation of these, but in general: "re-written" is close to "maximum detection" on the previous scale, and "word-to-word" is close to "maximum exactness". These settings are the result of extensive tests on big collections of Plagiarized pieces and provided the best results we could reach in recent years.

We are sorry to hear this was so important to you, and I will ask our RnD about the possibility of bringing back some manual settings in the future.

Force-updating to the current version was needed due to some third-party changes that made the previous versions worthless.

P.S. Any other users who need the manual settings: please inform us either in this topic or by e-mail to our support. The more people need it, the more urgent the task will be.

29
Versions before 874 have a problem with occasional license loss, somehow connected with the USB storage devices used. If you need to have your license re-initialized, please send us an e-mail mentioning your license number.

30
We switched back from v874 to v849 (850 is almost identical) due to negative feedback.

When you need your license re-initialized (new OS/PC), please send us an e-mail mentioning your license code.
