A Comprehensive Data Sampling Analysis Applied to the Classification of Rare IoT Network Intrusion Types

Title: A Comprehensive Data Sampling Analysis Applied to the Classification of Rare IoT Network Intrusion Types
Publication Type: Conference Paper
Year of Publication: 2021
Authors: Sapre, Suchet; Islam, Khondkar; Ahmadi, Pouyan
Conference Name: 2021 IEEE 18th Annual Consumer Communications & Networking Conference (CCNC)
Keywords: Artificial neural networks, composability, Data Sampling, generative adversarial networks, Internet of Things, Intrusion detection, machine learning, Measurement, Metrics, network intrusion, network intrusion detection, NSL-KDD, pubcrawl, resilience, Resiliency
Abstract: With the rapid growth of Internet of Things (IoT) network intrusion attacks, there is a critical need for sophisticated and comprehensive intrusion detection systems (IDSs). Classifying infrequent intrusion types such as remote-to-local (R2L) and user-to-root (U2R) attacks is a recurring problem for IDSs. In this study, various data sampling and class balancing techniques (Generative Adversarial Network (GAN)-based oversampling, k-nearest-neighbor (kNN) oversampling, NearMiss-1 undersampling, and class weights) were used to resolve the severe class imbalance affecting U2R and R2L attacks in the NSL-KDD intrusion detection dataset. Artificial Neural Networks (ANNs) were trained on the adjusted datasets, and their performance was evaluated with a range of classification metrics. Here, we show that using no data sampling technique (baseline), GAN-based oversampling, and NearMiss-1 undersampling, each combined with class weights, achieved high performance in identifying R2L and U2R attacks. Of these, the baseline with class weights had the highest overall performance, with F1-scores of 0.11 and 0.22 for the identification of U2R and R2L attacks, respectively.
DOI: 10.1109/CCNC49032.2021.9369617
Citation Key: sapre_comprehensive_2021
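
The abstract names class weights and per-class F1-scores but does not say how either was computed. As a minimal sketch only, the snippet below assumes the common "balanced" heuristic (weight = total samples / (number of classes × class count)) and the standard per-class F1 definition. The function names and the toy label counts are hypothetical, chosen for illustration, and are not taken from the paper or from NSL-KDD.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Assumption: the paper may have used a different weighting scheme.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


def per_class_f1(y_true, y_pred, cls):
    """F1-score for a single class, treating it as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up, heavily imbalanced labels (not real NSL-KDD class counts).
toy_labels = ["normal"] * 90 + ["R2L"] * 8 + ["U2R"] * 2
weights = balanced_class_weights(toy_labels)
```

Under this heuristic the rarest class receives the largest weight, so misclassifying a U2R sample costs more during training than misclassifying a normal one. Reporting F1 per class, rather than overall accuracy, keeps the majority class from hiding poor performance on the rare attack types.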