Text Alignment System for Plagiarism Detection, version 1.0 (2014)

 Note: This version is superseded by version 2.0.

This system was the best-performing on the first corpus and the third best-performing on the second corpus in the Text Alignment task at the 2014 international competition PAN - Uncovering Plagiarism, Authorship, and Social Software Misuse.

The system is described in the following paper(s):

Miguel Sanchez-Perez, Grigori Sidorov, Alexander Gelbukh. The Winning Approach to Text Alignment for Text Reuse Detection at PAN 2014. In: L. Cappellato, N. Ferro, M. Halvey, W. Kraaij (eds.). Notebook for PAN at CLEF 2014. CLEF 2014. CLEF2014 Working Notes. Sheffield, UK, September 15-18, 2014. CEUR Workshop Proceedings, ISSN 1613-0073, Vol. 1180, CEUR-WS.org, 2014, pp. 1004–1011.

(This paper may also be mistakenly indexed as “A Winning Approach to Text Alignment for Text Reuse Detection at PAN 2014”.)

Abstract of the same paper published separately.

Any work that uses this data or software should cite the abovementioned paper(s).

License: free for non-commercial academic purposes. Any publication that benefited from these data or software must state the origin of the data and software and cite the abovementioned paper(s). We will be grateful to you if you let us know of the use of the data or software and of citing our papers. Any derived work should specify the original source and its authors and contain this license, including the publication references mentioned above. If you modify this corpus or software, correct errors in it, or add annotation/functionality to it, we will be grateful if you send us the new version, to be available from this site. See also individual license files or comments in the specific files, if any.

Version 1:

    This is the version that was submitted to PAN 2014 and achieved the best result among the 11 participating systems.

    Download the Text Alignment System for Plagiarism Detection: all files in one ZIP, or as separate files: license, readme, code.

Next version:

    Version 2.0.

Previous versions:

    None so far.