Intrinsic Plagiarism Detection System Using Stylometric Features and DBSCAN

Title: Intrinsic Plagiarism Detection System Using Stylometric Features and DBSCAN
Publication Type: Conference Paper
Year of Publication: 2021
Authors: Saini, Anu; Sri, Manepalli Ratna; Thakur, Mansi
Conference Name: 2021 International Conference on Computing, Communication, and Intelligent Systems (ICCCIS)
Keywords: Author, Databases, DBSCAN, detection, Detectors, feature extraction, Human Behavior, Intrinsic, Metrics, Plagiarism, pubcrawl, Segments, stylometry, Tools, Web pages, Writing, writing style
Abstract: Plagiarism is the act of using someone else's words or ideas without giving due credit and presenting them as one's own work. Advances in technology have made it very easy to plagiarize others' work, whether from the Internet or from offline sources such as books or magazines. On the basis of detection, plagiarism can be classified into two broad categories, namely extrinsic and intrinsic plagiarism. Extrinsic plagiarism detection refers to detecting plagiarism in a document by comparing it against a given reference dataset, whereas intrinsic plagiarism detection refers to detecting plagiarism from variations in writing style, without using any reference corpus. Although many approaches can be adopted to detect extrinsic plagiarism, few are available for intrinsic plagiarism detection. In this paper, a simplified approach is proposed for developing an intrinsic plagiarism detector that can detect plagiarism even when no reference corpus is available. The approach identifies the writing styles of the authors in a document using stylometric features and Density-Based Spatial Clustering of Applications with Noise (DBSCAN). The proposed system has an easy-to-use interactive interface in which the user uploads a text document to be checked for plagiarism, and the result is displayed on the web page itself. In addition, the user can view an analysis of the document in the form of graphs.
DOI: 10.1109/ICCCIS51004.2021.9397187
Citation Key: saini_intrinsic_2021
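
The abstract describes the pipeline only at a high level: split the document into segments, describe each segment with stylometric features, and cluster the segments with DBSCAN so that stylistic outliers stand out. The following is a minimal sketch of that idea and not the authors' implementation. The segment size, the four features, the values of `eps` and `min_samples`, and all function names (`segment`, `features`, `standardize`, `dbscan`, `suspicious_segments`) are assumptions chosen for illustration; the record does not state which features or parameters the paper uses. DBSCAN is written out in plain Python here so the example needs only the standard library.

```python
import math
import re
import statistics


def segment(text, size=3):
    """Split text into chunks of `size` sentences (naive regex splitter, an assumption)."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [" ".join(sentences[i:i + size]) for i in range(0, len(sentences), size)]


def features(chunk):
    """Four illustrative stylometric features for one chunk (hypothetical choice)."""
    words = re.findall(r"[A-Za-z']+", chunk)
    sentences = [s for s in re.split(r"[.!?]+", chunk) if s.strip()]
    n = max(len(words), 1)
    return [
        sum(len(w) for w in words) / n,        # mean word length
        len(words) / max(len(sentences), 1),   # mean sentence length in words
        len({w.lower() for w in words}) / n,   # type-token ratio
        chunk.count(",") / n,                  # commas per word
    ]


def standardize(rows):
    """Z-score each feature column so no single feature dominates the distance."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]  # constant column: divide by 1
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def dbscan(points, eps, min_samples):
    """Plain DBSCAN with Euclidean distance; returns one label per point, -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # A point counts as its own neighbor, as in the usual formulation.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = -1                     # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster            # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_samples:  # j is a core point, so keep expanding
                queue.extend(reachable)
    return labels


def suspicious_segments(text, size=3, eps=1.5, min_samples=2):
    """Flag chunks that fall outside the largest cluster (assumed to be the main author)."""
    chunks = segment(text, size)
    labels = dbscan(standardize([features(c) for c in chunks]), eps, min_samples)
    clustered = [label for label in labels if label != -1]
    if not clustered:
        return chunks, labels, []              # no dominant style found, nothing to flag
    main = max(set(clustered), key=clustered.count)
    flagged = [i for i, label in enumerate(labels) if label != main]
    return chunks, labels, flagged
```

Calling `suspicious_segments(document_text)` returns the chunks, their cluster labels, and the indices of chunks whose style differs from the dominant cluster. Treating the largest cluster as the main author is a simplifying assumption; a real system would need tuned parameters, a richer feature set, and longer segments for the features to be stable.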