Plagiarism Detection Using Semantic Knowledge Graphs
Title | Plagiarism Detection Using Semantic Knowledge Graphs |
Publication Type | Conference Paper |
Year of Publication | 2018 |
Authors | Khadilkar, Kunal, Kulkarni, Siddhivinayak, Bone, Poojarani |
Conference Name | 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA) |
ISBN Number | 978-1-5386-5257-2 |
Keywords | data mining, Google, graph theory, Human Behavior, industrial property, intellectual property, knowledgegraphs, named entity recognition, natural language processing, nip, Plagiarism, plagiarism detection, pubcrawl, relations, resilience, Resiliency, Scalability, Semantic, semantic knowledge graphs, semantic similarity, Semantics, Software, string matching algorithms, text analysis, unstructured text |
Abstract | Huge amounts of unstructured text are generated every day, mostly in the form of essays, research papers, patents, scholarly articles, book chapters, and similar documents. Many plagiarism detection tools are being developed to reduce the theft and plagiarism of Intellectual Property (IP). Current tools rely mainly on string matching algorithms to detect text copied from another source. A drawback of some of these tools is that they cannot detect plagiarism when the structure of a sentence is changed, and they also fail to detect keywords that have been replaced by synonyms. This paper proposes a new method that detects such plagiarism using semantic knowledge graphs. The method uses Named Entity Recognition together with semantic similarity between sentences to detect possible cases of plagiarism. Doubtful cases are visualized as semantic knowledge graphs so that their authenticity can be analyzed thoroughly. The proposed methodology also takes rules for active and passive voice into account. |
URL | https://ieeexplore.ieee.org/document/8697404 |
DOI | 10.1109/ICCUBEA.2018.8697404 |
Citation Key | khadilkar_plagiarism_2018 |
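The abstract's point about synonym replacement can be illustrated with a minimal sketch. This is not the authors' method: the paper describes Named Entity Recognition, sentence-level semantic similarity, and knowledge graphs, none of which are reproduced here. The sketch only contrasts plain token overlap with overlap computed after mapping words to a shared concept. The `SYNONYMS` table, the function names, and the example sentences are all hypothetical and chosen for illustration.

```python
import re

# Hypothetical hand-made synonym table (an assumption for this sketch).
# A real system would draw on a lexical resource or a semantic model.
SYNONYMS = {
    "bought": "buy",
    "purchased": "buy",
    "car": "car",
    "automobile": "car",
}


def tokens(sentence):
    """Lowercase the sentence and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def jaccard(a, b):
    """Jaccard overlap of two token collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def surface_similarity(s1, s2):
    """Overlap of the raw tokens, a stand-in for string matching."""
    return jaccard(tokens(s1), tokens(s2))


def normalized_similarity(s1, s2):
    """Overlap after mapping each token to its concept, if one is listed."""
    def normalize(sentence):
        return [SYNONYMS.get(t, t) for t in tokens(sentence)]
    return jaccard(normalize(s1), normalize(s2))


original = "Alice bought a new car"
reworded = "Alice purchased a new automobile"

print(surface_similarity(original, reworded))
print(normalized_similarity(original, reworded))
```

In this toy case the raw token overlap is low because two of the five words were swapped for synonyms, while the concept-level overlap treats the two sentences as identical. Changes in sentence structure, such as active versus passive voice, would need additional handling that this sketch does not attempt.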