CHECK: A document plagiarism detection system

Antonio Si, Hong Va Leong, Rynson W.H. Lau

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-review

108 Citations (Scopus)

Abstract

Digital documents are vulnerable to being copied. Most existing copy detection prototypes use an exhaustive sentence-based method, comparing a potentially plagiarized document against a repository of legal or original documents to identify plagiarism. This approach does not scale, because of the potentially large number of original documents and the large number of sentences in each document. Furthermore, the security level of existing mechanisms is quite weak; a plagiarized document could bypass detection simply by making a minor modification to each sentence. In this paper, we propose a copy detection mechanism that eliminates unnecessary comparisons. It is based on the observation that comparisons between two documents addressing different subjects are not necessary. We describe the design and implementation of our experimental prototype, called CHECK. We present the results of some exploratory experiments and discuss the security level of our mechanism.
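The idea of skipping comparisons between documents on different subjects can be illustrated with a minimal sketch. It is not the CHECK implementation, which the abstract does not detail. The keyword-overlap subject test, the stopword list, the `subject_threshold` value, and every function name below are assumptions made for illustration only.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "are",
             "for", "on", "that", "this", "with"}


def keywords(text, top_n=10):
    """Return the top_n most frequent non-stopword terms as a set."""
    words = [w for w in re.findall(r"[a-z]+", text.lower())
             if w not in STOPWORDS]
    return {w for w, _ in Counter(words).most_common(top_n)}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def sentences(text):
    """Split text into lowercased, stripped sentences."""
    return [s.strip().lower() for s in re.split(r"[.!?]+", text) if s.strip()]


def candidate_matches(suspect, repository, subject_threshold=0.2):
    """Run sentence comparison only against same-subject documents.

    repository maps a document name to its text. Documents whose
    keyword sets overlap too little with the suspect are skipped
    before any sentence-level work is done.
    """
    suspect_keys = keywords(suspect)
    suspect_sents = set(sentences(suspect))
    results = {}
    for name, original in repository.items():
        if jaccard(suspect_keys, keywords(original)) < subject_threshold:
            continue  # assumed different subject: no sentence comparison
        shared = suspect_sents & set(sentences(original))
        if shared:
            results[name] = sorted(shared)
    return results
```

The sentence step here uses exact matching, so it shares the weakness the abstract attributes to existing mechanisms: a small edit to each sentence would evade it. The sketch only shows how a cheap subject filter reduces the number of document pairs that reach the expensive step.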
Original language: English
Title of host publication: Proceedings of the 1997 ACM Symposium on Applied Computing, SAC 1997
Publisher: Association for Computing Machinery
Pages: 70-77
Number of pages: 8
ISBN (Print): 0897918509, 9780897918503
Publication status: Published - 1 Jan 1997
Event: 1997 ACM Symposium on Applied Computing, SAC 1997 - San Jose, CA, United States
Duration: 28 Feb 1997 to 1 Mar 1997

Conference

Conference: 1997 ACM Symposium on Applied Computing, SAC 1997
Country: United States
City: San Jose, CA
Period: 28/02/97 to 1/03/97

Keywords

  • Copy detection
  • Digital libraries
  • Document plagiarism
  • Information retrieval

ASJC Scopus subject areas

  • Software
