Title | Machine Learning Approaches for Authorship Attribution using Source Code Stylometry |
Publication Type | Conference Paper |
Year of Publication | 2021 |
Authors | Frankel, Sophia F., Ghosh, Krishnendu |
Conference Name | 2021 IEEE International Conference on Big Data (Big Data) |
Keywords | authorship attribution, Big Data, codes, Conferences, Deep Learning, feature extraction, Human Behavior, machine learning, Metrics, natural language processing, pubcrawl, source code, stylometry, Syntactics |
Abstract | Identification of source code authorship is vital for attribution. In this work, a machine learning framework is described to identify source code authorship. The framework integrates features extracted using natural language processing based approaches with features from the abstract syntax tree of the code. We evaluate the methodology on the Google Code Jam dataset. We present the performance measures of logistic regression and deep learning on the dataset. |
DOI | 10.1109/BigData52589.2021.9671332 |
Citation Key | frankel_machine_2021 |