Title | A Comprehensive Data Sampling Analysis Applied to the Classification of Rare IoT Network Intrusion Types |
Publication Type | Conference Paper |
Year of Publication | 2021 |
Authors | Sapre, Suchet, Islam, Khondkar, Ahmadi, Pouyan |
Conference Name | 2021 IEEE 18th Annual Consumer Communications & Networking Conference (CCNC) |
Keywords | Artificial neural networks, composability, Data Sampling, generative adversarial networks, Internet of Things, Intrusion detection, machine learning, Measurement, Metrics, network intrusion, network intrusion detection, NSL-KDD, pubcrawl, resilience, Resiliency |
Abstract | With the rapid growth of Internet of Things (IoT) network intrusion attacks, there is a critical need for sophisticated and comprehensive intrusion detection systems (IDSs). Classifying infrequent intrusion types such as root-to-local (R2L) and user-to-root (U2R) attacks is a reoccurring problem for IDSs. In this study, various data sampling and class balancing techniques, namely Generative Adversarial Network (GAN)-based oversampling, k-nearest-neighbor (kNN) oversampling, NearMiss-1 undersampling, and class weights, were used to resolve the severe class imbalance affecting U2R and R2L attacks in the NSL-KDD intrusion detection dataset. Artificial Neural Networks (ANNs) were trained on the adjusted datasets, and their performances were evaluated with a multitude of classification metrics. Here, we show that using no data sampling technique (baseline), GAN-based oversampling, and NearMiss-1 undersampling, all with class weights, displayed high performances in identifying R2L and U2R attacks. Of these, the baseline with class weights had the highest overall performance with an F1-score of 0.11 and 0.22 for the identification of U2R and R2L attacks, respectively. |
DOI | 10.1109/CCNC49032.2021.9369617 |
Citation Key | sapre_comprehensive_2021 |