Version 1.0 of our system was the best-performing system on the first corpus and the third best-performing system on the second corpus in the Text Alignment task of the 2014 international competition PAN (Uncovering Plagiarism, Authorship, and Social Software Misuse). Version 2.0, an improved successor to version 1.0, participated in the PAN 2015 competition, but no winner was announced that year.
The system is described in the following paper(s):
Any work that uses this data or software should cite the above-mentioned paper(s).
License: free for non-commercial academic purposes. Any publication that benefited from these data or software must state the origin of the data and software and cite the abovementioned paper(s). We will be grateful to you if you let us know of the use of the data or software and of citing our papers. Any derived work should specify the original source and its authors and contain this license, including the publication references mentioned above. If you modify this corpus or software, correct errors in it, or add annotation/functionality to it, we will be grateful if you send us the new version, to be available from this site. See also individual license files or comments in the specific files, if any.
Version 2.0 (current):
This is the version submitted to PAN 2015 (no winner was announced that year).
Download the Text Alignment System for Plagiarism Detection version 2.0: all files in one ZIP, or as separate files (license, readme, code).