Biblio
The analysis of security-related event logs is an important step for the investigation of cyber-attacks. It allows tracing malicious activities and lets a security operator find out what has happened. However, since IT landscapes are growing in size and diversity, the amount of events and their highly different representations are becoming a Big Data challenge. Unfortunately, current solutions for the analysis of security-related events, so called Security Information and Event Management (SIEM) systems, are not able to keep up with the load. In this work, we propose a distributed SIEM platform that makes use of highly efficient distributed normalization and persists event data into an in-memory database. We implement the normalization on common distribution frameworks, i.e. Spark, Storm, Trident and Heron, and compare their performance with our custom-built distribution solution. Additionally, different tuning options are introduced and their speed advantage is presented. In the end, we show how the writing into an in-memory database can be tuned to achieve optimal persistence speed. Using the proposed approach, we are able to not only fully normalize, but also persist more than 20 billion events per day with relatively small client hardware. Therefore, we are confident that our approach can handle the load of events in even very large IT landscapes.