NO-DOUBT: Attack Attribution Based On Threat Intelligence Reports
Title | NO-DOUBT: Attack Attribution Based On Threat Intelligence Reports |
Publication Type | Conference Paper |
Year of Publication | 2019 |
Authors | Perry, Lior, Shapira, Bracha, Puzis, Rami |
Conference Name | 2019 IEEE International Conference on Intelligence and Security Informatics (ISI) |
Date Published | July 2019 |
Publisher | IEEE |
ISBN Number | 978-1-7281-2504-6 |
Keywords | attack attribution, attribution, classification, composability, compositionality, feature extraction, Human Behavior, Information Reuse and Security, invasive software, learning (artificial intelligence), machine learning, machine learning algorithms, Malware, Metrics, natural language processing, NLP, pubcrawl, Resiliency, security, security analytics, security literature, Task Analysis, Text, text analysis, text representation algorithm, threat actors, threat intelligence, threat intelligence reports, Training |
Abstract | The task of attack attribution, i.e., identifying the entity responsible for an attack, is complicated and usually requires the involvement of an experienced security expert. Prior attempts to automate attack attribution apply various machine learning techniques on features extracted from the malware's code and behavior in order to identify other similar malware whose authors are known. However, the same malware can be reused by multiple actors, and the actor who performed an attack using a malware might differ from the malware's author. Moreover, information collected during an incident may contain many clues about the identity of the attacker in addition to the malware used. In this paper, we propose a method of attack attribution based on textual analysis of threat intelligence reports, using state-of-the-art algorithms and models from the fields of machine learning and natural language processing (NLP). We have developed a new text representation algorithm which captures the context of the words and requires minimal feature engineering. Our approach relies on vector space representation of incident reports derived from a small collection of labeled reports and a large corpus of general security literature. Both datasets have been made available to the research community. Experimental results show that the proposed representation can attribute attacks more accurately than the baselines' representations. In addition, we show how the proposed approach can be used to identify novel, previously unseen threat actors and identify similarities between known threat actors. |
URL | https://ieeexplore.ieee.org/document/8823152 |
DOI | 10.1109/ISI.2019.8823152 |
Citation Key | perry_no-doubt_2019 |
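The abstract describes attribution as comparing vector space representations of incident reports against labeled reports, and flagging reports that match no known actor. The following is a minimal sketch of that general idea only, not the paper's representation algorithm: it swaps in plain term-frequency vectors and cosine similarity as a stand-in. The function names, the `min_similarity` threshold, the actor names, and the report texts are all hypothetical and chosen for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def vectorize(text):
    """Map a report to a sparse term-frequency vector (a stand-in representation)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_profiles(labeled_reports):
    """Sum the vectors of each actor's labeled reports into one profile per actor."""
    profiles = defaultdict(Counter)
    for actor, text in labeled_reports:
        profiles[actor].update(vectorize(text))
    return dict(profiles)


def attribute(report, profiles, min_similarity=0.2):
    """Return (actor, similarity); actor is None when no known profile is close enough.

    The threshold is an assumed, untuned value used to illustrate flagging
    a possibly unseen actor.
    """
    vec = vectorize(report)
    actor, score = max(
        ((name, cosine(vec, profile)) for name, profile in profiles.items()),
        key=lambda pair: pair[1],
    )
    return (actor, score) if score >= min_similarity else (None, score)


# Invented toy data: actor names and report texts are placeholders, not real reports.
LABELED = [
    ("ActorA", "spear phishing email with macro document dropping a custom backdoor"),
    ("ActorA", "macro document delivered by phishing email installs backdoor"),
    ("ActorB", "watering hole website exploits browser flaw to install ransomware"),
]
PROFILES = build_profiles(LABELED)

print(attribute("phishing email carrying a macro document and a backdoor", PROFILES))
print(attribute("supply chain compromise of firmware update server", PROFILES))
```

In this toy setup the first report shares most of its terms with the ActorA profile, while the second shares none with either profile and is returned with `None` as the actor. A context-aware representation such as the one the abstract mentions would replace `vectorize`; the comparison and thresholding steps would stay structurally similar under this sketch's assumptions.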