Biblio

Filters: Author is Khoshgoftaar, Taghi M.
2022-06-14
Zuech, Richard, Hancock, John, Khoshgoftaar, Taghi M.  2021.  Feature Popularity Between Different Web Attacks with Supervised Feature Selection Rankers. 2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA). :30–37.
We introduce the novel concept of feature popularity with three different web attacks and big data from the CSE-CIC-IDS2018 dataset: Brute Force, SQL Injection, and XSS web attacks. Feature popularity is based upon ensemble Feature Selection Techniques (FSTs) and allows us to more easily understand common important features between different cyberattacks, for two main reasons. First, feature popularity lists can be generated to provide easy comprehension of important features across different attacks. Second, the Jaccard similarity metric can provide a quantitative score for how similar feature subsets are between different attacks. Both of these approaches not only provide more explainable and easier-to-understand models, but they can also reduce the complexity of implementing models in real-world systems. Four supervised learning-based FSTs are used to generate feature subsets for each of our three web attack datasets, and then our feature popularity framework is applied. For these three web attacks, the XSS and SQL Injection feature subsets are the most similar per the Jaccard similarity. The most popular features across all three web attacks are: Flow_Bytes_s, Flow_IAT_Max, and Flow_Packets_s. While this introductory study is a simple example with only three web attacks, the feature popularity concept can easily be extended, allowing an automated framework to determine the most popular features across a very large number of attacks and features.
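As a concrete illustration of the two mechanisms this abstract describes, the sketch below computes pairwise Jaccard similarity between feature subsets and a popularity count across attacks. The feature subsets are hypothetical placeholders (only Flow_Bytes_s, Flow_IAT_Max, and Flow_Packets_s come from the abstract; the remaining names are invented for illustration), not the paper's actual ranker outputs.

```python
from collections import Counter
from itertools import combinations

# Feature subsets "selected" per attack (illustrative placeholders).
subsets = {
    "Brute Force":   {"Flow_Bytes_s", "Flow_IAT_Max", "Flow_Packets_s", "Fwd_IAT_Min"},
    "SQL Injection": {"Flow_Bytes_s", "Flow_IAT_Max", "Flow_Packets_s", "Bwd_Pkt_Len_Max"},
    "XSS":           {"Flow_Bytes_s", "Flow_IAT_Max", "Flow_Packets_s", "Bwd_Pkt_Len_Max"},
}

def jaccard(a, b):
    """Size of intersection over size of union: 1.0 = identical, 0.0 = disjoint."""
    return len(a & b) / len(a | b)

# Pairwise similarity between the attacks' feature subsets.
for (n1, s1), (n2, s2) in combinations(subsets.items(), 2):
    print(f"{n1} vs {n2}: Jaccard = {jaccard(s1, s2):.2f}")

# Feature popularity: how many of the attacks each feature was selected for.
popularity = Counter(f for s in subsets.values() for f in s)
print(popularity.most_common())
```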
Hancock, John, Khoshgoftaar, Taghi M., Leevy, Joffrey L.  2021.  Detecting SSH and FTP Brute Force Attacks in Big Data. 2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA). :760–765.
We present a simple approach for detecting brute force attacks in the CSE-CIC-IDS2018 Big Data dataset. We show our approach is preferable to more complex approaches since it is simpler and yields stronger classification performance. Our contribution is to show that it is possible to train and test simple Decision Tree models with two independent variables to classify CSE-CIC-IDS2018 data with better results than reported in previous research, where more complex Deep Learning models are employed. Moreover, we show that Decision Tree models trained on data with two independent variables perform similarly to Decision Tree models trained on a larger number of independent variables. Our experiments reveal that simple models, with AUC and AUPRC scores greater than 0.99, are capable of detecting brute force attacks in CSE-CIC-IDS2018. To the best of our knowledge, these are the strongest performance metrics published for the machine learning task of detecting these types of attacks. Furthermore, the simplicity of our approach, combined with its strong performance, makes it an appealing technique.
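A minimal sketch in the spirit of the paper's central claim: a shallow Decision Tree trained on just two independent variables, scored with AUC and AUPRC. Synthetic imbalanced data stands in for CSE-CIC-IDS2018 so the snippet is self-contained, and the hyperparameters are illustrative assumptions, not the authors' settings.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import roc_auc_score, average_precision_score

# Synthetic, imbalanced two-feature data (stand-in for two CSE-CIC-IDS2018 columns).
X, y = make_classification(n_samples=10_000, n_features=2, n_informative=2,
                           n_redundant=0, weights=[0.95], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]                  # probability of the attack class
print("AUC:  ", roc_auc_score(y_te, scores))            # area under the ROC curve
print("AUPRC:", average_precision_score(y_te, scores))  # area under the PR curve
```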
2022-03-01
Leevy, Joffrey L., Hancock, John, Khoshgoftaar, Taghi M., Seliya, Naeem.  2021.  IoT Reconnaissance Attack Classification with Random Undersampling and Ensemble Feature Selection. 2021 IEEE 7th International Conference on Collaboration and Internet Computing (CIC). :41–49.
The exponential increase in the use of Internet of Things (IoT) devices has been accompanied by a spike in cyberattacks on IoT networks. In this research, we investigate the Bot-IoT dataset with a focus on classifying IoT reconnaissance attacks. Reconnaissance attacks are a foundational step in the cyberattack lifecycle. Our contribution is centered on the building of predictive models with the aid of Random Undersampling (RUS) and ensemble Feature Selection Techniques (FSTs). As far as we are aware, this type of experimentation has never been performed for the Reconnaissance attack category of Bot-IoT. Our work uses the Area Under the Receiver Operating Characteristic Curve (AUC) metric to quantify the performance of a diverse range of classifiers: LightGBM, CatBoost, XGBoost, Random Forest (RF), Logistic Regression (LR), Naive Bayes (NB), Decision Tree (DT), and a Multilayer Perceptron (MLP). For this study, we determined that the best learners are DT and DT-based ensemble classifiers, the best RUS ratio is 1:1 or 1:3, and the best ensemble FST is our "6 Agree" technique.
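The sketch below illustrates, under stated assumptions, the two techniques named above: Random Undersampling at a configurable class ratio, and an agreement-style ensemble feature selector. We read "6 Agree" as keeping features selected by at least six of the rankers; that reading, and both helper names, are our assumptions rather than the paper's code.

```python
from collections import Counter
import numpy as np

def random_undersample(X, y, ratio=1.0, seed=0):
    """Keep all minority rows (label 1) and ratio * n_minority majority rows.
    ratio=1.0 gives a 1:1 minority:majority split; ratio=3.0 gives 1:3."""
    rng = np.random.default_rng(seed)
    minority = np.flatnonzero(y == 1)
    majority = np.flatnonzero(y == 0)
    keep = rng.choice(majority, size=int(ratio * len(minority)), replace=False)
    idx = np.concatenate([minority, keep])
    return X[idx], y[idx]

def agree_k(ranker_outputs, k):
    """Ensemble FST by agreement: ranker_outputs is a list of feature-name
    sets, one per FST; keep features chosen by at least k of them."""
    votes = Counter(f for s in ranker_outputs for f in s)
    return {f for f, v in votes.items() if v >= k}
```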
2020-08-28
Hasanin, Tawfiq, Khoshgoftaar, Taghi M., Leevy, Joffrey L.  2019.  A Comparison of Performance Metrics with Severely Imbalanced Network Security Big Data. 2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI). :83–88.
Severe class imbalance between the majority and minority classes in large datasets can bias Machine Learning classifiers toward the majority class. Our work uniquely consolidates two case studies, each utilizing three learners implemented within an Apache Spark framework, six sampling methods, and five sampling distribution ratios to analyze the effect of severe class imbalance on big data analytics. We use three performance metrics to evaluate this study: Area Under the Receiver Operating Characteristic Curve, Area Under the Precision-Recall Curve, and Geometric Mean. In the first case study, models were trained on one dataset (POST) and tested on another (SlowlorisBig). In the second case study, the training and testing dataset roles were switched. Our comparison of performance metrics shows that Area Under the Precision-Recall Curve and Geometric Mean are sensitive to changes in the sampling distribution ratio, whereas Area Under the Receiver Operating Characteristic Curve is relatively unaffected. In addition, we demonstrate that when comparing sampling methods, borderline-SMOTE2 outperforms the other methods in the first case study, and Random Undersampling is the top performer in the second case study.
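For reference, a minimal sketch of the three metrics this study compares. AUC and AUPRC are computed from continuous scores and are threshold-free, while Geometric Mean is computed from hard predictions as the square root of TPR times TNR, which is one reason it reacts to changes in the sampling distribution. The toy arrays are illustrative only.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score, confusion_matrix

def geometric_mean(y_true, y_pred):
    """sqrt(TPR * TNR) from hard 0/1 predictions."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    tpr = tp / (tp + fn)   # recall on the minority (attack) class
    tnr = tn / (tn + fp)   # recall on the majority class
    return np.sqrt(tpr * tnr)

# Toy labels and classifier scores for illustration.
y_true  = np.array([0, 0, 0, 0, 0, 0, 1, 1])
y_score = np.array([0.10, 0.20, 0.15, 0.30, 0.60, 0.05, 0.80, 0.40])

print("AUC:   ", roc_auc_score(y_true, y_score))
print("AUPRC: ", average_precision_score(y_true, y_score))
print("G-mean:", geometric_mean(y_true, (y_score >= 0.5).astype(int)))
```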
2020-04-03
Calvert, Chad L., Khoshgoftaar, Taghi M.  2019.  Threshold Based Optimization of Performance Metrics with Severely Imbalanced Big Security Data. 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI). :1328–1334.
Proper evaluation of classifier predictive models requires the selection of appropriate metrics to gauge the effectiveness of a model's performance. The Area Under the Receiver Operating Characteristic Curve (AUC) has become the de facto standard metric for evaluating classifier performance. However, recent studies have suggested that AUC is not necessarily the best metric for all types of datasets, especially those in which there exists a high or severe level of class imbalance. There is a need to assess which specific metrics are most beneficial for evaluating the performance of models on highly imbalanced big data. In this work, we evaluate the performance of eight machine learning techniques on a severely imbalanced big dataset pertaining to the cyber security domain. We analyze the behavior of six different metrics to determine which provides the best representation of a model's predictive performance. We also evaluate the impact that adjusting the classification threshold has on our metrics. Our results show that the C4.5N decision tree is the optimal learner when evaluating all presented metrics for severely imbalanced Slow HTTP DoS attack data. Based on our results, we propose that the use of AUC alone as a primary metric for evaluating highly imbalanced big data may be ineffective, and that the evaluation of metrics such as F-measure and Geometric Mean can offer substantial insight into the true performance of a given model.
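A hedged illustration of the threshold adjustment this abstract mentions: sweep the classification threshold over a model's scores and report the thresholds that maximize F-measure and Geometric Mean. The scores are synthetic stand-ins for a trained classifier's outputs, and the sweep granularity is an arbitrary choice.

```python
import numpy as np
from sklearn.metrics import f1_score, confusion_matrix

rng = np.random.default_rng(0)
y_true = (rng.random(5000) < 0.02).astype(int)                  # ~2% positives
y_score = np.clip(0.3 * y_true + rng.normal(0.2, 0.15, 5000), 0, 1)

def g_mean(y, y_hat):
    """sqrt(TPR * TNR) from hard 0/1 predictions."""
    tn, fp, fn, tp = confusion_matrix(y, y_hat).ravel()
    return np.sqrt((tp / (tp + fn)) * (tn / (tn + fp)))

# Evaluate both metrics at each candidate threshold.
thresholds = np.linspace(0.05, 0.95, 19)
f1s = [f1_score(y_true, (y_score >= t).astype(int), zero_division=0) for t in thresholds]
gms = [g_mean(y_true, (y_score >= t).astype(int)) for t in thresholds]
print("best F-measure at threshold", thresholds[int(np.argmax(f1s))])
print("best G-mean    at threshold", thresholds[int(np.argmax(gms))])
```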