NO-DOUBT: Attack Attribution Based On Threat Intelligence Reports

Title: NO-DOUBT: Attack Attribution Based On Threat Intelligence Reports
Publication Type: Conference Paper
Year of Publication: 2019
Authors: Perry, Lior; Shapira, Bracha; Puzis, Rami
Conference Name: 2019 IEEE International Conference on Intelligence and Security Informatics (ISI)
Date Published: July 2019
Publisher: IEEE
ISBN Number: 978-1-7281-2504-6
Keywords: attack attribution, attribution, classification, composability, compositionality, feature extraction, Human Behavior, Information Reuse and Security, invasive software, learning (artificial intelligence), machine learning, machine learning algorithms, Malware, Metrics, natural language processing, NLP, pubcrawl, Resiliency, security, security analytics, security literature, Task Analysis, Text, text analysis, text representation algorithm, threat actors, threat intelligence, threat intelligence reports, Training
Abstract

The task of attack attribution, i.e., identifying the entity responsible for an attack, is complicated and usually requires the involvement of an experienced security expert. Prior attempts to automate attack attribution apply various machine learning techniques to features extracted from the malware's code and behavior in order to identify other similar malware whose authors are known. However, the same malware can be reused by multiple actors, and the actor who performed an attack using a piece of malware might differ from the malware's author. Moreover, information collected during an incident may contain many clues about the identity of the attacker in addition to the malware used. In this paper, we propose a method of attack attribution based on textual analysis of threat intelligence reports, using state-of-the-art algorithms and models from the fields of machine learning and natural language processing (NLP). We have developed a new text representation algorithm that captures the context of the words and requires minimal feature engineering. Our approach relies on a vector space representation of incident reports derived from a small collection of labeled reports and a large corpus of general security literature. Both datasets have been made available to the research community. Experimental results show that the proposed representation can attribute attacks more accurately than the baselines' representations. In addition, we show how the proposed approach can be used to identify novel, previously unseen threat actors and to identify similarities between known threat actors.
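
As a rough illustration of what attributing a report through a vector space representation can look like, the sketch below uses a plain TF-IDF bag-of-words vector and cosine similarity to the nearest labeled report. This is not the paper's representation algorithm; it is closer in spirit to a simple baseline. The actor names, report texts, the `attribute` function, and the similarity threshold of 0.2 are all hypothetical and chosen only for the example. A query whose best similarity falls below the threshold is treated as a possibly unseen actor.

```python
import math
from collections import Counter

# Hypothetical toy data: actor names and report texts are invented for illustration.
LABELED_REPORTS = [
    ("ActorA", "spear phishing email with macro document dropping custom backdoor"),
    ("ActorA", "macro document delivered by phishing email installs backdoor"),
    ("ActorB", "ransomware spread through exposed remote desktop servers"),
    ("ActorB", "attackers brute forced remote desktop and deployed ransomware"),
]


def tokenize(text):
    """Lowercase and split on whitespace (deliberately simplistic)."""
    return text.lower().split()


def build_idf(documents):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def vectorize(text, idf):
    """L2-normalized sparse TF-IDF vector; terms outside the vocabulary are dropped."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def attribute(report, labeled, idf, threshold=0.2):
    """Return (actor, score) for the most similar labeled report.

    The actor is None when no labeled report reaches the (assumed) threshold,
    which this sketch interprets as a possibly unseen threat actor.
    """
    query = vectorize(report, idf)
    best_actor, best_score = None, 0.0
    for actor, text in labeled:
        score = cosine(query, vectorize(text, idf))
        if score > best_score:
            best_actor, best_score = actor, score
    if best_score < threshold:
        return None, best_score
    return best_actor, best_score


idf = build_idf([text for _, text in LABELED_REPORTS])
known = attribute("phishing email carrying a macro document", LABELED_REPORTS, idf)
unseen = attribute("supply chain compromise of a software update", LABELED_REPORTS, idf)
print(known[0])   # ActorA
print(unseen[0])  # None
```

The second query shares no vocabulary with the labeled reports, so its similarity to every report is zero and it is flagged as unknown. A context-aware representation of the kind the abstract describes would be expected to behave differently on such inputs, since it would not depend on exact word overlap.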

URL: https://ieeexplore.ieee.org/document/8823152
DOI: 10.1109/ISI.2019.8823152
Citation Key: perry_no-doubt_2019