Biblio

Filters: Keyword is Plagiarism
2023-05-12
Huang, Song, Yang, Zhen, Zheng, Changyou, Wang, Yang, Du, Jinhu, Ding, Yixian, Wan, Jinyong.  2022.  Intellectual Property Right Confirmation System Oriented to Crowdsourced Testing Services. 2022 International Conference on Blockchain Technology and Information Security (ICBCTIS). :64–68.

In crowdsourced testing services, the intellectual property of test work faces problems such as code plagiarism, difficulty in confirming rights, and unreliable data. Blockchain is a decentralized, tamper-proof distributed ledger that can help solve these problems. This paper proposes an intellectual property right confirmation system oriented to crowdsourced testing services that combines blockchain, IPFS (InterPlanetary File System), digital signatures, and code similarity detection to confirm the intellectual property of crowdsourced testing. The performance test shows that the system can meet the requirements of normal crowdsourcing business as well as high-concurrency situations.
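The tamper-evidence property that the abstract attributes to the blockchain can be illustrated with a minimal hash-chained ledger. This is only a sketch under stated assumptions, not the system described in the paper: the record fields (`owner`, `artifact_hash`, `prev_hash`, `record_hash`) and the functions `append_record` and `verify` are hypothetical names, and digital signatures, IPFS storage, and consensus are left out entirely.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def _body_hash(body: dict) -> str:
    # Serialize with sorted keys so the hash does not depend on dict order.
    return sha256_hex(json.dumps(body, sort_keys=True).encode("utf-8"))


def append_record(ledger: list, owner: str, artifact: bytes) -> dict:
    """Append a record that commits to the artifact and to the previous record."""
    prev_hash = ledger[-1]["record_hash"] if ledger else "0" * 64
    body = {
        "owner": owner,                         # hypothetical field
        "artifact_hash": sha256_hex(artifact),  # fingerprint of the submitted work
        "prev_hash": prev_hash,                 # link to the previous record
    }
    record = dict(body, record_hash=_body_hash(body))
    ledger.append(record)
    return record


def verify(ledger: list) -> bool:
    """Return True only if every record is intact and correctly linked."""
    prev_hash = "0" * 64
    for record in ledger:
        body = {k: record[k] for k in ("owner", "artifact_hash", "prev_hash")}
        if record["prev_hash"] != prev_hash:
            return False
        if record["record_hash"] != _body_hash(body):
            return False
        prev_hash = record["record_hash"]
    return True
```

Changing any field of an earlier record changes its hash, so `verify` fails for that record or for the link that follows it. A real deployment would additionally sign each record and replicate the ledger across nodes.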

2022-09-09
Saini, Anu, Sri, Manepalli Ratna, Thakur, Mansi.  2021.  Intrinsic Plagiarism Detection System Using Stylometric Features and DBSCAN. 2021 International Conference on Computing, Communication, and Intelligent Systems (ICCCIS). :13–18.
Plagiarism is the act of using someone else’s words or ideas without giving them due credit and representing them as one’s own work. In today's world, it is very easy to plagiarize others' work due to advances in technology, especially through the Internet or offline sources such as books or magazines. Plagiarism can be classified into two broad categories on the basis of detection, namely extrinsic and intrinsic plagiarism. Extrinsic plagiarism detection refers to detecting plagiarism in a document by comparing it against a given reference dataset, whereas intrinsic plagiarism detection refers to detecting plagiarism from variation in writing style, without using any reference corpus. Although many approaches can be adopted to detect extrinsic plagiarism, few are available for intrinsic plagiarism detection. In this paper, a simplified approach is proposed for developing an intrinsic plagiarism detector that can detect plagiarism even when no reference corpus is available. The approach identifies the writing styles of the authors in a document using stylometric features and Density-Based Spatial Clustering of Applications with Noise (DBSCAN) clustering. The proposed system has an easy-to-use interactive interface where the user uploads a text document to be checked for plagiarism and the result is displayed on the web page itself. In addition, the user can see the analysis of the document in the form of graphs.
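The idea of clustering text segments by writing style can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the three features in `stylometric_features` are assumptions (the abstract does not list the features used), and the small `dbscan` function is a plain-Python version of the standard algorithm written for this example. Segments labeled `-1` (noise) would be the candidates for a deviating writing style.

```python
import math
import re


def stylometric_features(text: str) -> tuple:
    """Return (average word length, words per sentence, commas per word).

    These three features are hypothetical stand-ins chosen for illustration.
    """
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    avg_word_len = sum(map(len, words)) / max(len(words), 1)
    avg_sent_len = len(words) / max(len(sentences), 1)
    comma_rate = text.count(",") / max(len(words), 1)
    return (avg_word_len, avg_sent_len, comma_rate)


def dbscan(points: list, eps: float, min_pts: int) -> list:
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    noise, unseen = -1, None
    labels = [unseen] * len(points)

    def neighbors(i: int) -> list:
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not unseen:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = noise          # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster    # border point: reachable but not core
            if labels[j] is not unseen:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point, keep expanding
    return labels
```

In use, each paragraph of a document would be mapped to a feature vector and the vectors passed to `dbscan`. Because the features have different scales, they would normally be standardized first, and `eps` and `min_pts` need tuning per corpus.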
White, Riley, Sprague, Nathan.  2021.  Deep Metric Learning for Code Authorship Attribution and Verification. 2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA). :1089–1093.
Code authorship identification can assist in identifying creators of malware, identifying plagiarism, and giving insights in copyright infringement cases. Taking inspiration from facial recognition work, we apply recent advances in metric learning to the problem of authorship identification and verification. The metric learning approach makes it possible to measure similarity in the learned embedding space. Access to a discriminative similarity measure allows for the estimation of probability distributions that facilitate open-set classification and verification. We extend our analysis to verification based on sets of files, a previously unexplored problem domain in large-scale author identification. On closed-set tasks we achieve competitive accuracies, but do not improve on the state of the art.
Khan, Aazar Imran, Jain, Samyak, Sharma, Purushottam, Deep, Vikas, Mehrotra, Deepti.  2021.  Stylometric Analysis of Writing Patterns Using Artificial Neural Networks. 2021 International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies (3ICT). :29–35.
Plagiarism checkers have been widely used to verify the authenticity of dissertation/project submissions. However, when non-verbatim plagiarism or online examinations are considered, this practice is not the best solution. In this work, we propose a better authentication system for online examinations that analyses the submitted text's stylometry and matches it against the writing pattern of the author who submitted the text. The writing pattern is analyzed over many indicators (i.e., features of one's writing style). This model extracts 27 such features and stores them as the writing pattern of an individual. Stylometric analysis is a better approach to verifying a document's authorship because it does not check for plagiarism but verifies whether the document was written by a particular individual, and hence rules out the use of text converters or translators. This paper also includes a brief comparative analysis of some simpler algorithms for the same problem statement. These algorithms yield results that vary in precision and accuracy, and the comparison leads to the conclusion that the best way to tackle this problem is through Artificial Neural Networks.
2021-10-12
Deng, Perry, Linsky, Cooper, Wright, Matthew.  2020.  Weaponizing Unicodes with Deep Learning -Identifying Homoglyphs with Weakly Labeled Data. 2020 IEEE International Conference on Intelligence and Security Informatics (ISI). :1–6.
Visually similar characters, or homoglyphs, can be used to perform social engineering attacks or to evade spam and plagiarism detectors. It is thus important to understand the capabilities of an attacker to identify homoglyphs - particularly ones that have not been previously spotted - and leverage them in attacks. We investigate a deep-learning model using embedding learning, transfer learning, and augmentation to determine the visual similarity of characters and thereby identify potential homoglyphs. Our approach uniquely takes advantage of weak labels that arise from the fact that most characters are not homoglyphs. Our model drastically outperforms the Normalized Compression Distance approach on pairwise homoglyph identification, for which we achieve an average precision of 0.97. We also present the first attempt at clustering homoglyphs into sets of equivalence classes, which is more efficient than pairwise information for security practitioners to quickly look up homoglyphs or to normalize confusable string encodings. To measure clustering performance, we propose a metric (mBIOU) building on the classic Intersection-Over-Union (IOU) metric. Our clustering method achieves 0.592 mBIOU, compared to 0.430 for the naive baseline. We also use our model to predict over 8,000 previously unknown homoglyphs, and find good early indications that many of these may be true positives. Source code and list of predicted homoglyphs are uploaded to GitHub: https://github.com/PerryXDeng/weaponizing_unicode.
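For readers unfamiliar with the baseline named in this abstract, the Normalized Compression Distance between two byte strings x and y is commonly defined as (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed size. The sketch below assumes zlib as the compressor and operates on raw bytes; how the paper encodes character glyphs before applying the distance is not stated in the abstract, so that step is omitted.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed input (compressor is an assumption)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance: near 0 for similar inputs, near 1 for unrelated ones."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)
```

Because real compressors are imperfect, the value for identical inputs is small but not exactly zero, and values slightly above 1 can occur for unrelated inputs.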
2020-07-30
Zhang, Jin, Jin, Dahai, Gong, Yunzhan.  2018.  File Similarity Determination Based on Function Call Graph. 2018 IEEE International Conference on Electronics and Communication Engineering (ICECE). :55–59.
Program similarity detection is important in code reuse, plagiarism detection, intellectual property protection, and information retrieval. Attribute counting methods cannot take program semantics into account. Methods based on syntax trees or graph structures have a very high construction cost and low space efficiency, so they are difficult to apply to large-scale software systems. This paper uses different decision strategies for different levels and then puts forward a similarity detection method at the file level. This method makes full use of the features of the program while taking time and space efficiency into account. Using static analysis, we obtain the function features and control flow features of files and, based on these, build the function call graph. The degree of similarity between two files can then be measured from their two graphs. Experimental results show that the method can effectively detect similar files. Finally, this paper discusses directions for developing this method further.
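Comparing two files through their function call graphs can be sketched as below. This is a simplified illustration, not the method of the paper: it assumes Python source files, extracts caller-to-callee edges with the standard `ast` module, and swaps in plain Jaccard similarity over edge sets as the comparison measure. The names `call_edges` and `jaccard` are hypothetical, and the paper's function and control flow features are not modeled.

```python
import ast


def call_edges(source: str) -> set:
    """Return (caller, callee) name pairs for top-level functions in Python source."""
    edges = set()
    for func in ast.parse(source).body:
        if not isinstance(func, ast.FunctionDef):
            continue
        for node in ast.walk(func):
            # Only direct calls by simple name are recorded, e.g. load(x),
            # not method calls such as obj.load(x).
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                edges.add((func.name, node.func.id))
    return edges


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


FILE_A = "def load(p):\n    return open(p)\n\ndef main():\n    data = load('x')\n    print(data)\n"
FILE_B = "def load(path):\n    return open(path)\n\ndef main():\n    print(load('y'))\n"
FILE_C = "def main():\n    print('hi')\n"

similarity_ab = jaccard(call_edges(FILE_A), call_edges(FILE_B))
similarity_ac = jaccard(call_edges(FILE_A), call_edges(FILE_C))
```

`FILE_A` and `FILE_B` differ in variable names and statement layout but share the same call edges, so they score higher than `FILE_A` against `FILE_C`. Because edges are keyed by function name, renaming functions would defeat this sketch; a structural graph comparison would be needed to handle that.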
Holland, Martin, Stjepandić, Josip, Nigischer, Christopher.  2018.  Intellectual Property Protection of 3D Print Supply Chain with Blockchain Technology. 2018 IEEE International Conference on Engineering, Technology and Innovation (ICE/ITMC). :1–8.
Within the “Industrie 4.0” approach, 3D printing technology is characterized as one of the disruptive innovations. Conventional supply chains are replaced by value-added networks. The spatially distributed development of printed components, e.g. for the rapid delivery of spare parts, creates a new challenge when differentiating between “original part”, “copy” or “counterfeit” becomes necessary. This is especially true for safety-critical products. Based on these changes, classic branded products adopt the characteristics of licensing models as we know them in the areas of software and digital media. This paper describes the use of digital rights management as a key technology for the successful transition to Additive Manufacturing methods and a key for its commercial implementation and the prevention of intellectual property theft. Risks are identified along the process chain and solution concepts are presented. These are currently being developed by an 8-partner project named SAMPL (Secure Additive Manufacturing Platform).
2020-05-22
Khadilkar, Kunal, Kulkarni, Siddhivinayak, Bone, Poojarani.  2018.  Plagiarism Detection Using Semantic Knowledge Graphs. 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA). :1–6.

Every day, huge amounts of unstructured text are generated. Most of this data is in the form of essays, research papers, patents, scholastic articles, book chapters, etc. Many plagiarism detection tools are being developed to reduce the stealing and plagiarizing of Intellectual Property (IP). Current tools mainly use string matching algorithms to detect copying of text from another source. A drawback of some of these tools is their inability to detect plagiarism when the structure of the sentence is changed. Replacement of keywords by their synonyms also goes undetected. This paper proposes a new method to detect such plagiarism using semantic knowledge graphs. The method uses Named Entity Recognition as well as semantic similarity between sentences to detect possible cases of plagiarism. The doubtful cases are visualized using semantic knowledge graphs for thorough analysis of authenticity. Rules for active and passive voice have also been considered in the proposed methodology.

2019-06-10
Xue, S., Zhang, L., Li, A., Li, X., Ruan, C., Huang, W.  2018.  AppDNA: App Behavior Profiling via Graph-Based Deep Learning. IEEE INFOCOM 2018 - IEEE Conference on Computer Communications. :1475–1483.

Better understanding of mobile applications' behaviors would lead to better malware detection/classification and better app recommendation for users. In this work, we design a framework, AppDNA, to automatically generate a compact representation for each app to comprehensively profile its behaviors. The behavior difference between two apps can be measured by the distance between their representations. As a result, the versatile representation can be generated once for each app and then be used for a wide variety of objectives, including malware detection, app categorizing, plagiarism detection, etc. Based on a systematic and deep understanding of an app's behavior, we propose to perform function-call-graph-based app profiling. We carefully design a graph-encoding method to convert a typically extremely large call graph to a 64-dimensional fixed-size vector to achieve robust app profiling. Our extensive evaluations based on 86,332 benign and malicious apps demonstrate that our system performs app profiling (thus malware detection, classification, and app recommendation) to a high accuracy with extremely low computation cost: it classifies 4024 (benign/malware) apps in around 5.06 seconds with an accuracy of about 93.07%; it classifies the families of 570 malware samples (21 families in total) in around 0.83 seconds with an accuracy of 82.3%; it classifies 9,730 apps' functionality with an accuracy of 33.3% for a total of 7 categories and an accuracy of 88.1% for 2 categories.

2019-02-08
Ivanova, M., Durcheva, M., Baneres, D., Rodríguez, M. E.  2018.  eAssessment by Using a Trustworthy System in Blended and Online Institutions. 2018 17th International Conference on Information Technology Based Higher Education and Training (ITHET). :1–7.

eAssessment uses technology to support online evaluation of students' knowledge and skills. However, challenging problems must be addressed, such as trustworthiness among students and teachers in blended and online settings. The TeSLA system proposes an innovative solution to guarantee correct authentication of students and to prove the authorship of their assessment tasks. Technologically, the system is based on the integration of five instruments: face recognition, voice recognition, keystroke dynamics, forensic analysis, and plagiarism detection. The paper aims to analyze and compare the results achieved after the second pilot, performed in an online and a blended university, revealing the realization of trust-driven solutions for eAssessment.

2018-03-19
Shahid, U., Farooqi, S., Ahmad, R., Shafiq, Z., Srinivasan, P., Zaffar, F.  2017.  Accurate Detection of Automatically Spun Content via Stylometric Analysis. 2017 IEEE International Conference on Data Mining (ICDM). :425–434.

Spammers use automated content spinning techniques to evade plagiarism detection by search engines. Text spinners help spammers in evading plagiarism detectors by automatically restructuring sentences and replacing words or phrases with their synonyms. Prior work on spun content detection relies on the knowledge about the dictionary used by the text spinning software. In this work, we propose an approach to detect spun content and its seed without needing the text spinner's dictionary. Our key idea is that text spinners introduce stylometric artifacts that can be leveraged for detecting spun documents. We implement and evaluate our proposed approach on a corpus of spun documents that are generated using a popular text spinning software. The results show that our approach can not only accurately detect whether a document is spun but also identify its source (or seed) document - all without needing the dictionary used by the text spinner.