Efficient Network Intrusion Detection Using PCA-Based Dimensionality Reduction of Features

Title: Efficient Network Intrusion Detection Using PCA-Based Dimensionality Reduction of Features
Publication Type: Conference Paper
Year of Publication: 2019
Authors: Abdulhammed, R., Faezipour, M., Musafer, H., Abuzneid, A.
Conference Name: 2019 International Symposium on Networks, Computers and Communications (ISNCC)
Date Published: June 2019
Publisher: IEEE
ISBN Number: 978-1-7281-1244-2
Keywords: Bayes methods, Bayesian Network, belief networks, binary classification, CICIDS2017 network intrusion dataset, class distribution parameters, composability, detection rate, dimensionality reduction, feature dimensionality reduction approach, feature extraction, high-dimensional features, IDS, imbalanced class distributions, imbalanced data, imbalanced distribution, intrusion detection system, IP networks, learning (artificial intelligence), low-dimensional features, machine learning, Measurement, minority class instances, multiclass classification show, multiclass combined performance metric, network intrusion detection system, network traffic, PCA, PCA-based dimensionality reduction, principal component analysis, pubcrawl, resilience, Resiliency, security of data, Support vector machines
Abstract

Designing a machine learning based network intrusion detection system (IDS) with high-dimensional features can lead to prolonged classification processes. This is while low-dimensional features can reduce these processes. Moreover, classification of network traffic with imbalanced class distributions has posed a significant drawback on the performance attainable by most well-known classifiers. With the presence of imbalanced data, the known metrics may fail to provide adequate information about the performance of the classifier. This study first uses Principal Component Analysis (PCA) as a feature dimensionality reduction approach. The resulting low-dimensional features are then used to build various classifiers such as Random Forest (RF), Bayesian Network, Linear Discriminant Analysis (LDA) and Quadratic Discriminant Analysis (QDA) for designing an IDS. The experimental findings with low-dimensional features in binary and multi-class classification show better performance in terms of Detection Rate (DR), F-Measure, False Alarm Rate (FAR), and Accuracy. Furthermore, in this paper, we apply a Multi-Class Combined performance metric CombinedMc with respect to class distribution through incorporating FAR, DR, Accuracy, and class distribution parameters. In addition, we developed a uniform distribution based balancing approach to handle the imbalanced distribution of the minority class instances in the CICIDS2017 network intrusion dataset. We were able to reduce the CICIDS2017 dataset's feature dimensions from 81 to 10 using PCA, while maintaining a high accuracy of 99.6% in multi-class and binary classification.
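
For readers unfamiliar with the evaluation metrics named in the abstract, the sketch below computes Detection Rate, False Alarm Rate, Accuracy, and F-Measure from binary confusion-matrix counts. It assumes the conventional definitions of these metrics, which may differ in detail from those used in the paper. The function name and the example counts are hypothetical and are not taken from the paper, and the combined multi-class metric is not reproduced here.

```python
def binary_ids_metrics(tp, fp, tn, fn):
    """Conventional binary metrics from confusion-matrix counts.

    Assumed definitions (not confirmed against the paper):
      detection rate (recall) = TP / (TP + FN)
      false alarm rate        = FP / (FP + TN)
      accuracy                = (TP + TN) / total
      F-measure               = 2 * TP / (2 * TP + FP + FN)
    """
    def ratio(num, den):
        # Return 0.0 instead of raising when a class is absent.
        return num / den if den else 0.0

    return {
        "detection_rate": ratio(tp, tp + fn),
        "false_alarm_rate": ratio(fp, fp + tn),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "f_measure": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Hypothetical counts: 100 attack flows and 100 benign flows.
metrics = binary_ids_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With imbalanced traffic, accuracy alone can stay high while the detection rate on a rare attack class is poor, which is why the abstract reports the four metrics together.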

URL: https://ieeexplore.ieee.org/document/8909140
DOI: 10.1109/ISNCC.2019.8909140
Citation Key: abdulhammed_efficient_2019