What database do you choose for heterogeneous security log events analysis?

Title: What database do you choose for heterogeneous security log events analysis?
Publication Type: Conference Paper
Year of Publication: 2021
Authors: Lagraa, Sofiane; State, Radu
Conference Name: 2021 IFIP/IEEE International Symposium on Integrated Network Management (IM)
Keywords: Cluster computing, composability, Databases, Firewalls (computing), Human Behavior, Loading, Metrics, NoSQL databases, pubcrawl, relational database security, relational databases, resilience, Resiliency, Structured Query Language
Abstract: Massive heterogeneous logs arriving from multiple sources pose major challenges to IT security professionals and system administrators. One of these challenges is to develop a scalable database of heterogeneous logs for storage and further analysis. In practice, it is difficult to decide which database best suits a given use case in terms of execution time and storage performance. In this paper, we explore, study, and compare the performance of SQL and NoSQL databases on large heterogeneous event logs. We implement the relational database using MySQL, the column-oriented database using Impala on top of Hadoop, and the graph database using Neo4j. We run experiments on a large set of heterogeneous logs and provide advice on the pros and cons of each SQL and NoSQL database. Our findings show that Impala outperforms MySQL and Neo4j in log loading, execution time of simple queries, and log storage. However, Neo4j outperforms Impala and MySQL in the execution time of complex queries.
Citation Key: lagraa_what_2021