Biblio
The analysis of security-related event logs is an important step for the investigation of cyber-attacks. It allows tracing malicious activities and lets a security operator find out what has happened. However, since IT landscapes are growing in size and diversity, the amount of events and their highly different representations are becoming a Big Data challenge. Unfortunately, current solutions for the analysis of security-related events, so called Security Information and Event Management (SIEM) systems, are not able to keep up with the load. In this work, we propose a distributed SIEM platform that makes use of highly efficient distributed normalization and persists event data into an in-memory database. We implement the normalization on common distribution frameworks, i.e. Spark, Storm, Trident and Heron, and compare their performance with our custom-built distribution solution. Additionally, different tuning options are introduced and their speed advantage is presented. In the end, we show how the writing into an in-memory database can be tuned to achieve optimal persistence speed. Using the proposed approach, we are able to not only fully normalize, but also persist more than 20 billion events per day with relatively small client hardware. Therefore, we are confident that our approach can handle the load of events in even very large IT landscapes.
Intrusion detection systems define an important and dynamic research area for cybersecurity. The role of Intrusion Detection System within security architecture is to improve a security level by identification of all malicious and also suspicious events that could be observed in computer or network system. One of the more specific research areas related to intrusion detection is anomaly detection. Anomaly-based intrusion detection in networks refers to the problem of finding untypical events in the observed network traffic that do not conform to the expected normal patterns. It is assumed that everything that is untypical/anomalous could be dangerous and related to some security events. To detect anomalies many security systems implements a classification or clustering algorithms. However, recent research proved that machine learning models might misclassify adversarial events, e.g. observations which were created by applying intentionally non-random perturbations to the dataset. Such weakness could increase of false negative rate which implies undetected attacks. This fact can lead to one of the most dangerous vulnerabilities of intrusion detection systems. The goal of the research performed was verification of the anomaly detection systems ability to resist this type of attack. This paper presents the preliminary results of tests taken to investigate existence of attack vector, which can use adversarial examples to conceal a real attack from being detected by intrusion detection systems.
The paper suggests several techniques for computer network risk assessment based on Common Vulnerability Scoring System (CVSS) and attack modeling. Techniques use a set of integrated security metrics and consider input data from security information and event management (SIEM) systems. Risk assessment techniques differ according to the used input data. They allow to get risk assessment considering requirements to the accuracy and efficiency. Input data includes network characteristics, attacks, attacker characteristics, security events and countermeasures. The tool that implements these techniques is presented. Experiments demonstrate operation of the techniques for different security situations.
Security companies have recently realised that mining massive amounts of security data can help generate actionable intelligence and improve their understanding of Internet attacks. In particular, attack attribution and situational understanding are considered critical aspects to effectively deal with emerging, increasingly sophisticated Internet attacks. This requires highly scalable analysis tools to help analysts classify, correlate and prioritise security events, depending on their likely impact and threat level. However, this security data mining process typically involves a considerable amount of features interacting in a non-obvious way, which makes it inherently complex. To deal with this challenge, we introduce MR-TRIAGE, a set of distributed algorithms built on MapReduce that can perform scalable multi-criteria data clustering on large security data sets and identify complex relationships hidden in massive datasets. The MR-TRIAGE workflow is made of a scalable data summarisation, followed by scalable graph clustering algorithms in which we integrate multi-criteria evaluation techniques. Theoretical computational complexity of the proposed parallel algorithms are discussed and analysed. The experimental results demonstrate that the algorithms can scale well and efficiently process large security datasets on commodity hardware. Our approach can effectively cluster any type of security events (e.g., spam emails, spear-phishing attacks, etc) that are sharing at least some commonalities among a number of predefined features.