Suspicious Network Event Recognition Using Modified Stacking Ensemble Machine Learning

Title: Suspicious Network Event Recognition Using Modified Stacking Ensemble Machine Learning
Publication Type: Conference Paper
Year of Publication: 2019
Authors: Huang, Angus F.M.; Chi-Wei, Yang; Tai, Hsiao-Chi; Chuan, Yang; Huang, Jay J.C.; Liao, Yu-Han
Conference Name: 2019 IEEE International Conference on Big Data (Big Data)
Keywords: 2019 IEEE BigData Cup Challenge, AdaBoost, artificial intelligence-oriented automatic services, Big Data, big data security in the cloud, big-data analytics, Conferences, cyber-threats, Data analysis, Data preprocessing, Data Science, Ensemble Learning, exploratory data analysis, extremely randomised trees, feature creation, feature selection, machine learning, Metrics, modified stacking ensemble machine learning, Network Event Log Analytics, network intrusions, network traffic alerts, neural nets, Neural networks, pattern classification, pubcrawl, Random Forest, random forests, resilience, Resiliency, Scalability, security of data, suspicious network event recognition dataset, suspicious network events
Abstract: This study aims to detect genuine suspicious events and false alarms within a dataset of network traffic alerts. The rapid development of cloud computing and artificial intelligence-oriented automatic services has enabled a large amount of data and information to be transmitted among network nodes. However, the number of cyber-threats, cyberattacks, and network intrusions has increased in various domains of network environments. Based on the fields of data science and machine learning, this paper proposes a series of solutions involving data preprocessing, exploratory data analysis, feature creation, feature selection, ensemble learning, model construction, and verification to identify suspicious network events. This paper proposes a modified form of stacking ensemble machine learning, which includes AdaBoost, Neural Networks, Random Forest, LightGBM, and Extremely Randomised Trees (Extra Trees), to realise a high-performance classification. A suspicious network event recognition dataset for a security operations centre, which uses real network log observations from the 2019 IEEE BigData Cup Challenge, is used as an experimental dataset. This paper investigates the possibility of integrating big-data analytics, machine learning, and data science to improve intelligent cybersecurity.
DOI: 10.1109/BigData47090.2019.9006391
Citation Key: huang_suspicious_2019
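
Illustrative sketch: The abstract names a stacking ensemble built from AdaBoost, neural networks, Random Forest, LightGBM, and Extra Trees, but does not say how the authors modified it. The Python sketch below shows only standard stacking with scikit-learn, not the paper's method. The synthetic data, the hyperparameters, the logistic-regression meta-learner, and the AUC metric are all assumptions chosen for illustration. LightGBM is left out so the example depends on scikit-learn alone.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    ExtraTreesClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def build_stacking_model(random_state=0):
    """Standard (unmodified) stacking over four of the base learners
    named in the abstract. All hyperparameters are illustrative guesses."""
    base_learners = [
        ("adaboost", AdaBoostClassifier(n_estimators=50, random_state=random_state)),
        ("random_forest", RandomForestClassifier(n_estimators=50, random_state=random_state)),
        ("extra_trees", ExtraTreesClassifier(n_estimators=50, random_state=random_state)),
        # The neural network is scaled first; tree models do not need scaling.
        ("neural_net", make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(16,), max_iter=500,
                          random_state=random_state),
        )),
    ]
    # The meta-learner is trained on out-of-fold class probabilities
    # produced by the base learners (3-fold cross-validation).
    return StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),  # assumed meta-learner
        cv=3,
        stack_method="predict_proba",
    )


# Synthetic, imbalanced stand-in for alert data (about 10% positives).
# This is NOT the BigData Cup dataset.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=8,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = build_stacking_model()
model.fit(X_train, y_train)

# Probability that each held-out alert is a genuine suspicious event.
scores = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, scores)
print(f"hold-out AUC: {auc:.3f}")
```

To include a gradient-boosting learner such as LightGBM, one more `(name, estimator)` pair could be appended to `base_learners`, provided the estimator exposes `predict_proba`.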