Biblio
Botnets are one of the major threats on the Internet. They are used for malicious activities to compromise the basic network security goals, namely Confidentiality, Integrity, and Availability. For reliable botnet detection and defense, deep learning-based approaches were recently proposed. In this paper, four different deep learning models, namely Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), hybrid CNN-LSTM, and Multi-layer Perception (MLP) are applied for botnet detection and simulation studies are carried out using the CTU-13 botnet traffic dataset. We use several performance metrics such as accuracy, sensitivity, specificity, precision, and F1 score to evaluate the performance of each model on classifying both known and unknown (zero-day) botnet traffic patterns. The results show that our deep learning models can accurately and reliably detect both known and unknown botnet traffic, and show better performance than other deep learning models.
Severe class imbalance between the majority and minority classes in large datasets can prejudice Machine Learning classifiers toward the majority class. Our work uniquely consolidates two case studies, each utilizing three learners implemented within an Apache Spark framework, six sampling methods, and five sampling distribution ratios to analyze the effect of severe class imbalance on big data analytics. We use three performance metrics to evaluate this study: Area Under the Receiver Operating Characteristic Curve, Area Under the Precision-Recall Curve, and Geometric Mean. In the first case study, models were trained on one dataset (POST) and tested on another (SlowlorisBig). In the second case study, the training and testing dataset roles were switched. Our comparison of performance metrics shows that Area Under the Precision-Recall Curve and Geometric Mean are sensitive to changes in the sampling distribution ratio, whereas Area Under the Receiver Operating Characteristic Curve is relatively unaffected. In addition, we demonstrate that when comparing sampling methods, borderline-SMOTE2 outperforms the other methods in the first case study, and Random Undersampling is the top performer in the second case study.
Proper evaluation of classifier predictive models requires the selection of appropriate metrics to gauge the effectiveness of a model's performance. The Area Under the Receiver Operating Characteristic Curve (AUC) has become the de facto standard metric for evaluating this classifier performance. However, recent studies have suggested that AUC is not necessarily the best metric for all types of datasets, especially those in which there exists a high or severe level of class imbalance. There is a need to assess which specific metrics are most beneficial to evaluate the performance of highly imbalanced big data. In this work, we evaluate the performance of eight machine learning techniques on a severely imbalanced big dataset pertaining to the cyber security domain. We analyze the behavior of six different metrics to determine which provides the best representation of a model's predictive performance. We also evaluate the impact that adjusting the classification threshold has on our metrics. Our results find that the C4.5N decision tree is the optimal learner when evaluating all presented metrics for severely imbalanced Slow HTTP DoS attack data. Based on our results, we propose that the use of AUC alone as a primary metric for evaluating highly imbalanced big data may be ineffective, and the evaluation of metrics such as F-measure and Geometric mean can offer substantial insight into the true performance of a given model.
Software-defined wireless sensor cognitive radio network is one of the emerging technologies which is simple, agile, and flexible. The sensor network comprises of a sink node with high processing power. The sensed data is transferred to the sink node in a hop-by-hop basis by sensor nodes. The network is programmable, automated, agile, and flexible. The sensor nodes are equipped with cognitive radios, which sense available spectrum bands and transmit sensed data on available bands, which improves spectrum utilization. Unfortunately, the Software-defined wireless sensor cognitive radio network is prone to security issues. The sinkhole attack is the most common attack which can also be used to launch other attacks. We propose and evaluate the performance of Hop Count-Based Sinkhole Attack detection Algorithm (HCOBASAA) using probability of detection, probability of false negative, and probability of false positive as the performance metrics. On average HCOBASAA managed to yield 100%, 75%, and 70% probability of detection.
Image retrieval systems have been an active area of research for more than thirty years progressively producing improved algorithms that improve performance metrics, operate in different domains, take advantage of different features extracted from the images to be retrieved, and have different desirable invariance properties. With the ever-growing visual databases of images and videos produced by a myriad of devices comes the challenge of selecting effective features and performing fast retrieval on such databases. In this paper, we incorporate Fourier descriptors (FD) along with a metric-based balanced indexing tree as a viable solution to DHS (Department of Homeland Security) needs to for quick identification and retrieval of weapon images. The FDs allow a simple but effective outline feature representation of an object, while the M-tree provide a dynamic, fast, and balanced search over such features. Motivated by looking for applications of interest to DHS, we have created a basic guns and rifles databases that can be used to identify weapons in images and videos extracted from media sources. Our simulations show excellent performance in both representation and fast retrieval speed.
Cognitive radio networks (CRNs) enable secondary users (SU) to make use of licensed spectrum without interfering with the signal generated by primary users (PUs). To avoid such interference, the SU is required to sense the medium for a period of time and eventually use it only if the band is perceived to be idle. In this context, the encryption process is carried out for the SU requests prior to their transmission whilst the strength of the security in CRNs is directly proportional to the length of the encryption key. If a request of a PU on arrival finds an SU request being either encrypted or transmitted, then the SU is preempted from service. However, excessive sensing time for the detection of free spectrum by SUs as well as extended periods of the CRN being at an insecure state have an adverse impact on network performance. To this end, a generalized stochastic Petri net (GSPN) is proposed in order to investigate sensing vs. security vs. performance trade-offs, leading to an efficient use of the spectrum band. Typical numerical simulation experiments are carried out, based on the application of the Mobius Petri Net Package and associated interpretations are made.