Authorship Attribution Using Relative Compression
Title | Authorship Attribution Using Relative Compression |
Publication Type | Conference Paper |
Year of Publication | 2016 |
Authors | Pinho, Armando J.; Pratas, Diogo; Ferreira, Paulo J. S. G. |
Publisher | IEEE |
ISBN Number | 978-1-5090-1853-6 |
Keywords | attribution, composability, Human Behavior, Metrics, pubcrawl |
Abstract | Authorship attribution is a classical classification problem. We use it here to illustrate the performance of a compression-based measure that relies on the notion of relative compression. Besides comparing with recent approaches that use multiple discriminant analysis and support vector machines, we compare it with the Normalized Conditional Compression Distance (a direct approximation of the Normalized Information Distance) and the popular Normalized Compression Distance. The Normalized Relative Compression (NRC) attained 100% correct classification in the data set used, showing consistency between the compression ratio and the classification performance, a characteristic not always present in other compression-based measures. |
URL | http://ieeexplore.ieee.org/document/7786177/ |
DOI | 10.1109/DCC.2016.53 |
Citation Key | pinho_authorship_2016 |
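
Illustrative note: the abstract names the Normalized Compression Distance (NCD) as one of the baselines. As a rough sketch of how such a compression-based measure can drive attribution, the snippet below computes the standard NCD, (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), and assigns an unknown text to the nearest reference author. This is not the paper's NRC measure or its experimental setup. The use of zlib as the compressor, and the names `compressed_size`, `ncd`, and `attribute`, are assumptions made here for illustration only.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (zlib is a stand-in compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance between x and y.

    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C(.) is the compressed size and xy is the concatenation.
    """
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def attribute(unknown: bytes, references: dict[str, bytes]) -> str:
    """Return the (hypothetical) author label whose reference text is nearest by NCD."""
    return min(references, key=lambda author: ncd(unknown, references[author]))
```

A call such as `attribute(sample, {"author_a": text_a, "author_b": text_b})` returns the label with the smallest distance. With a general-purpose compressor like zlib, the limited match window means long reference texts may be only partly exploited, so the result should be read as a toy demonstration rather than a reproduction of the reported figures.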