Suspicious Network Event Recognition Using Modified Stacking Ensemble Machine Learning

Submitted by grigby1 on Fri, 08/28/2020 - 3:34pm

Title	Suspicious Network Event Recognition Using Modified Stacking Ensemble Machine Learning
Publication Type	Conference Paper
Year of Publication	2019
Authors	Huang, Angus F.M., Chi-Wei, Yang, Tai, Hsiao-Chi, Chuan, Yang, Huang, Jay J.C., Liao, Yu-Han
Conference Name	2019 IEEE International Conference on Big Data (Big Data)
Keywords	2019 IEEE BigData Cup Challenge, AdaBoost, artificial intelligence-oriented automatic services, Big Data, big data security in the cloud, big-data analytics, Conferences, cyber-threats, Data analysis, Data preprocessing, Data Science, Ensemble Learning, exploratory data analysis, extremely randomised trees, feature creation, feature selection, machine learning, Metrics, modified stacking ensemble machine learning, Network Event Log Analytics, network intrusions, network traffic alerts, neural nets, Neural networks, pattern classification, pubcrawl, Random Forest, random forests, resilience, Resiliency, Scalability, security of data, suspicious network event recognition dataset, suspicious network events
Abstract	This study aims to detect genuine suspicious events and false alarms within a dataset of network traffic alerts. The rapid development of cloud computing and artificial intelligence-oriented automatic services has enabled a large amount of data and information to be transmitted among network nodes. However, the number of cyber-threats, cyberattacks, and network intrusions has increased in various domains of network environments. Based on the fields of data science and machine learning, this paper proposes a series of solutions involving data preprocessing, exploratory data analysis, feature creation, feature selection, ensemble learning, model construction, and verification to identify suspicious network events. This paper proposes a modified form of stacking ensemble machine learning that includes AdaBoost, Neural Networks, Random Forest, LightGBM, and Extremely Randomised Trees (Extra Trees) to realise high-performance classification. A suspicious network event recognition dataset for a security operations centre, which uses real network log observations from the 2019 IEEE BigData Cup Challenge, is used as the experimental dataset. This paper investigates the possibility of integrating big-data analytics, machine learning, and data science to improve intelligent cybersecurity.
DOI	10.1109/BigData47090.2019.9006391
Citation Key	huang_suspicious_2019
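
Illustrative sketch (not from the paper): the abstract names the base learners of the stacking ensemble but does not describe how the stacking is modified. The example below therefore shows only a standard stacking ensemble using scikit-learn's `StackingClassifier`, under several assumptions: LightGBM is left out to avoid an extra dependency, the logistic-regression meta-learner and all hyperparameters are arbitrary choices, the data is synthetic rather than the BigData Cup dataset, and ROC AUC is used as the metric purely for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    ExtraTreesClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier


def build_stacking_model(seed=0):
    """Standard (unmodified) stacking over four of the five learners named in the abstract."""
    base_learners = [
        ("ada", AdaBoostClassifier(n_estimators=50, random_state=seed)),
        ("mlp", MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=seed)),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=seed)),
        ("et", ExtraTreesClassifier(n_estimators=100, random_state=seed)),
    ]
    # The meta-learner is trained on out-of-fold class probabilities
    # produced by the base learners (5-fold cross-validation).
    return StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),  # assumed meta-learner
        cv=5,
        stack_method="predict_proba",
    )


# Synthetic, imbalanced stand-in for alert data (roughly 10% positive class).
X, y = make_classification(
    n_samples=600,
    n_features=20,
    n_informative=8,
    weights=[0.9, 0.1],
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = build_stacking_model()
model.fit(X_train, y_train)

# Probability of the positive ("suspicious") class for each held-out alert.
scores = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, scores)
print(f"hold-out ROC AUC: {auc:.3f}")
```

Training the meta-learner on out-of-fold predictions, rather than on predictions for data the base learners have already seen, keeps it from simply rewarding whichever base learner overfits the training set most.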
