Biblio
The task of attack attribution, i.e., identifying the entity responsible for an attack, is complicated and usually requires the involvement of an experienced security expert. Prior attempts to automate attack attribution apply various machine learning techniques on features extracted from the malware's code and behavior in order to identify other similar malware whose authors are known. However, the same malware can be reused by multiple actors, and the actor who performed an attack using a malware might differ from the malware's author. Moreover, information collected during an incident may contain many clues about the identity of the attacker in addition to the malware used. In this paper, we propose a method of attack attribution based on textual analysis of threat intelligence reports, using state of the art algorithms and models from the fields of machine learning and natural language processing (NLP). We have developed a new text representation algorithm which captures the context of the words and requires minimal feature engineering. Our approach relies on vector space representation of incident reports derived from a small collection of labeled reports and a large corpus of general security literature. Both datasets have been made available to the research community. Experimental results show that the proposed representation can attribute attacks more accurately than the baselines' representations. In addition, we show how the proposed approach can be used to identify novel previously unseen threat actors and identify similarities between known threat actors.
Threat actors are constantly seeking new attack surfaces, with ransomeware being one the most successful attack vectors that have been used for financial gain. This has been achieved through the dispersion of unlimited polymorphic samples of ransomware whilst those responsible evade detection and hide their identity. Nonetheless, every ransomware threat actor adopts some similar style or uses some common patterns in their malicious code writing, which can be significant evidence contributing to their identification. he first step in attempting to identify the source of the attack is to cluster a large number of ransomware samples based on very little or no information about the samples, accordingly, their traits and signatures can be analysed and identified. T herefore, this paper proposes an efficient fuzzy analysis approach to cluster ransomware samples based on the combination of two fuzzy techniques fuzzy hashing and fuzzy c-means (FCM) clustering. Unlike other clustering techniques, FCM can directly utilise similarity scores generated by a fuzzy hashing method and cluster them into similar groups without requiring additional transformational steps to obtain distance among objects for clustering. Thus, it reduces the computational overheads by utilising fuzzy similarity scores obtained at the time of initial triaging of whether the sample is known or unknown ransomware. The performance of the proposed fuzzy method is compared against k-means clustering and the two fuzzy hashing methods SSDEEP and SDHASH which are evaluated based on their FCM clustering results to understand how the similarity score affects the clustering results.
Cyber Physical Systems (CPS) operating in modern critical infrastructures (CIs) are increasingly being targeted by highly sophisticated cyber attacks. Threat actors have quickly learned of the value and potential impact of targeting CPS, and numerous tailored multi-stage cyber-physical attack campaigns, such as Advanced Persistent Threats (APTs), have been perpetrated in the last years. They aim at stealthily compromising systems' operations and cause severe impact on daily business operations such as shutdowns, equipment damage, reputation damage, financial loss, intellectual property theft, and health and safety risks. Protecting CIs against such threats has become as crucial as complicated. Novel distributed detection and reaction methodologies are necessary to effectively uncover these attacks, and timely mitigate their effects. Correlating large amounts of data, collected from a multitude of relevant sources, is fundamental for Security Operation Centers (SOCs) to establish cyber situational awareness, and allow to promptly adopt suitable countermeasures in case of attacks. In our previous work we introduced three methods for security information correlation. In this paper we define metrics and benchmarks to evaluate these correlation methods, we assess their accuracy, and we compare their performance. We finally demonstrate how the presented techniques, implemented within our cyber threat intelligence analysis engine called CAESAIR, can be applied to support incident handling tasks performed by SOCs.