A Comprehensive Data Sampling Analysis Applied to the Classification of Rare IoT Network Intrusion Types

Submitted by grigby1 on Tue, 03/01/2022 - 4:14pm

Title	A Comprehensive Data Sampling Analysis Applied to the Classification of Rare IoT Network Intrusion Types
Publication Type	Conference Paper
Year of Publication	2021
Authors	Sapre, Suchet; Islam, Khondkar; Ahmadi, Pouyan
Conference Name	2021 IEEE 18th Annual Consumer Communications & Networking Conference (CCNC)
Keywords	Artificial neural networks, composability, Data Sampling, generative adversarial networks, Internet of Things, Intrusion detection, machine learning, Measurement, Metrics, network intrusion, network intrusion detection, NSL-KDD, pubcrawl, resilience, Resiliency
Abstract	With the rapid growth of Internet of Things (IoT) network intrusion attacks, there is a critical need for sophisticated and comprehensive intrusion detection systems (IDSs). Classifying infrequent intrusion types such as root-to-local (R2L) and user-to-root (U2R) attacks is a recurring problem for IDSs. In this study, various data sampling and class balancing techniques were used to resolve the severe class imbalance affecting U2R and R2L attacks in the NSL-KDD intrusion detection dataset: Generative Adversarial Network (GAN)-based oversampling, k-nearest-neighbor (kNN) oversampling, NearMiss-1 undersampling, and class weights. Artificial Neural Networks (ANNs) were trained on the adjusted datasets, and their performance was evaluated with a range of classification metrics. Here, we show that no data sampling (baseline), GAN-based oversampling, and NearMiss-1 undersampling, each combined with class weights, displayed high performance in identifying R2L and U2R attacks. Of these, the baseline with class weights had the highest overall performance, with F1-scores of 0.11 and 0.22 for the identification of U2R and R2L attacks, respectively.
DOI	10.1109/CCNC49032.2021.9369617
Citation Key	sapre_comprehensive_2021
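
The abstract names class weights and per-class F1-scores without giving formulas. As a rough illustration only, the sketch below uses the common inverse-frequency ("balanced") weighting heuristic and the standard F1 definition. The weighting formula, the function names, and the toy labels are assumptions made for this example; they are not taken from the paper and the numbers are not its results.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * count_of_class).

    Assumption: this is the widely used "balanced" heuristic, not
    necessarily the weighting scheme used in the paper.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * k) for cls, k in counts.items()}


def f1_for_class(y_true, y_pred, cls):
    """F1-score for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    if tp == 0:
        # No true positives: precision or recall is zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy, made-up labels (hypothetical; not NSL-KDD data).
y_true = ["normal"] * 6 + ["R2L"] * 3 + ["U2R"] * 1
y_pred = ["normal"] * 5 + ["R2L"] + ["R2L"] * 2 + ["normal"] + ["U2R"]

weights = balanced_class_weights(y_true)
for cls in ("normal", "R2L", "U2R"):
    print(cls, round(weights[cls], 3), round(f1_for_class(y_true, y_pred, cls), 3))
```

Under this heuristic the rarest class receives the largest weight, so errors on it count more during training. Reporting F1 per class, rather than overall accuracy, keeps performance on rare classes such as U2R and R2L visible.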

Groups:

Science of Security VO