Biblio
The task of attack attribution, i.e., identifying the entity responsible for an attack, is complicated and usually requires the involvement of an experienced security expert. Prior attempts to automate attack attribution apply various machine learning techniques on features extracted from the malware's code and behavior in order to identify other similar malware whose authors are known. However, the same malware can be reused by multiple actors, and the actor who performed an attack using a malware might differ from the malware's author. Moreover, information collected during an incident may contain many clues about the identity of the attacker in addition to the malware used. In this paper, we propose a method of attack attribution based on textual analysis of threat intelligence reports, using state of the art algorithms and models from the fields of machine learning and natural language processing (NLP). We have developed a new text representation algorithm which captures the context of the words and requires minimal feature engineering. Our approach relies on vector space representation of incident reports derived from a small collection of labeled reports and a large corpus of general security literature. Both datasets have been made available to the research community. Experimental results show that the proposed representation can attribute attacks more accurately than the baselines' representations. In addition, we show how the proposed approach can be used to identify novel previously unseen threat actors and identify similarities between known threat actors.
Security companies have recently realised that mining massive amounts of security data can help generate actionable intelligence and improve their understanding of Internet attacks. In particular, attack attribution and situational understanding are considered critical aspects to effectively deal with emerging, increasingly sophisticated Internet attacks. This requires highly scalable analysis tools to help analysts classify, correlate and prioritise security events, depending on their likely impact and threat level. However, this security data mining process typically involves a considerable amount of features interacting in a non-obvious way, which makes it inherently complex. To deal with this challenge, we introduce MR-TRIAGE, a set of distributed algorithms built on MapReduce that can perform scalable multi-criteria data clustering on large security data sets and identify complex relationships hidden in massive datasets. The MR-TRIAGE workflow is made of a scalable data summarisation, followed by scalable graph clustering algorithms in which we integrate multi-criteria evaluation techniques. Theoretical computational complexity of the proposed parallel algorithms are discussed and analysed. The experimental results demonstrate that the algorithms can scale well and efficiently process large security datasets on commodity hardware. Our approach can effectively cluster any type of security events (e.g., spam emails, spear-phishing attacks, etc) that are sharing at least some commonalities among a number of predefined features.
Computing systems today have a large number of security configuration settings that enforce security properties. However, vulnerabilities and incorrect configuration increase the potential for attacks. Provable verification and simulation tools have been introduced to eliminate configuration conflicts and weaknesses, which can increase system robustness against attacks. Most of these tools require special knowledge in formal methods and precise specification for requirements in special languages, in addition to their excessive need for computing resources. Video games have been utilized by researchers to make educational software more attractive and engaging. Publishing these games for crowdsourcing can also stimulate competition between players and increase the game educational value. In this paper we introduce a game interface, called NetMaze, that represents the network configuration verification problem as a video game and allows for attack analysis. We aim to make the security analysis and hardening usable and accurately achievable, using the power of video games and the wisdom of crowdsourcing. Players can easily discover weaknesses in network configuration and investigate new attack scenarios. In addition, the gameplay scenarios can also be used to analyze and learn attack attribution considering human factors. In this paper, we present a provable mapping from the network configuration to 3D game objects.