Using Stylometry to Attribute Programmers and Writers

Title: Using Stylometry to Attribute Programmers and Writers
Publication Type: Conference Paper
Year of Publication: 2017
Authors: Greenstadt, Rachel
Conference Name: Proceedings of the 5th ACM Workshop on Information Hiding and Multimedia Security
Publisher: ACM
Conference Location: New York, NY, USA
ISBN Number: 978-1-4503-5061-7
Keywords: attribute-based encryption, attribution, Collaboration, Human Behavior, human factors, machine learning, policy-based governance, privacy, pubcrawl, Scalability, stylometry
Abstract

In this talk, I will discuss my lab's work in the emerging field of adversarial stylometry and machine learning. Machine learning algorithms are increasingly being used in security and privacy domains, in areas that go beyond intrusion or spam detection. For example, in digital forensics, questions often arise about the authors of documents: their identity, demographic background, and whether they can be linked to other documents. The field of stylometry uses linguistic features and machine learning techniques to answer these questions. We have applied stylometry to difficult domains such as underground hacker forums, open source projects (code), and tweets. I will discuss our Doppelgänger Finder algorithm, which enables us to group Sybil accounts on underground forums and detect blogs from Twitter feeds and Reddit comments. In addition, I will discuss our work attributing unknown source code and binaries.

URL: https://dl.acm.org/citation.cfm?doid=3082031.3092567
DOI: 10.1145/3082031.3092567
Citation Key: greenstadt_using_2017