Biblio
Network traffic anomaly detection is of critical importance in cybersecurity due to the massive and rapid growth of sophisticated computer network attacks. Indeed, the more new Internet-related technologies are created, the more elaborate the attacks become. Among contemporary high-level attacks, dictionary-based brute-force attacks (BFA) present one of the most formidable challenges. We need to develop effective methods to detect and mitigate such brute-force attacks in real time. In this paper, we investigate SSH and FTP brute-force attack detection using the Long Short-Term Memory (LSTM) deep learning approach. Additionally, we use five machine learning (ML) classifiers as further detectors: J48, naive Bayes (NB), decision table (DT), random forest (RF) and k-nearest-neighbor (k-NN). We use the well-known labelled dataset CICIDS2017. We evaluate the effectiveness of the LSTM and ML algorithms and compare their performance. Our results show that the LSTM model outperforms the ML algorithms, with an accuracy of 99.88%.
In this paper, we propose a deep learning framework for malware classification. The volume of malware has increased enormously in recent years, posing a serious security threat to financial institutions, businesses and individuals. To combat the proliferation of malware, new strategies are essential to quickly identify and classify malware samples so that their behavior can be analyzed. Machine learning approaches are becoming popular for classifying malware; however, most existing machine learning methods for malware classification use shallow learning algorithms (e.g. SVM). Recently, Convolutional Neural Networks (CNN), a deep learning approach, have shown superior performance compared to traditional learning algorithms, especially in tasks such as image classification. Motivated by this success, we propose a CNN-based architecture to classify malware samples. We convert malware binaries to grayscale images and subsequently train a CNN for classification. Experiments on two challenging malware classification datasets, Malimg and Microsoft malware, demonstrate that our method outperforms the state of the art. The proposed method achieves 98.52% and 99.97% accuracy on the Malimg and Microsoft datasets, respectively.
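The binary-to-grayscale conversion mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not state the image width or how a trailing partial row is handled, so the fixed `width` parameter and the zero padding are assumptions, and the function names `bytes_to_grayscale` and `to_pgm` are hypothetical. Each byte of the file is read as one pixel intensity in the range 0 to 255.

```python
from typing import List


def bytes_to_grayscale(data: bytes, width: int = 256) -> List[List[int]]:
    """Reshape raw bytes into rows of `width` pixels, one byte per pixel.

    The width and the zero padding of the last row are assumptions made
    for illustration; the abstract does not specify either.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])   # byte values are already 0..255
        row.extend([0] * (width - len(row)))    # pad a short final row with black
        rows.append(row)
    return rows


def to_pgm(image: List[List[int]]) -> bytes:
    """Serialize the pixel rows as a binary PGM (P5) grayscale image."""
    height = len(image)
    width = len(image[0]) if image else 0
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + bytes(value for row in image for value in row)
```

The resulting rows could then be resized and fed to an image classifier; that step depends on the chosen CNN framework and is not shown here.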
Security has always been a major issue in cloud computing. Data sources hold the most valuable and vulnerable information, which attackers aim to steal. If data is lost, the privacy and security of every cloud user are compromised. Even when a cloud network is secured externally, the threat of an internal attacker remains. Internal attackers compromise a vulnerable user node to gain access to a system. Because they are connected to the cloud network internally, they can launch attacks while pretending to be trusted users. Machine learning approaches are widely used to address cloud security issues. Existing machine learning based security approaches classify a node as misbehaving based on short-term behavioral data. These systems do not distinguish whether a misbehaving node is malicious or merely broken. To address this problem, this paper proposes an Improvised Long Short-Term Memory (ILSTM) model which learns the behavior of a user, automatically trains itself and stores the behavioral data. The model classifies user behavior as normal or abnormal. The proposed ILSTM not only identifies an anomalous node but also uses a calculated trust factor to determine whether a misbehaving node is a broken node, a new user node or a compromised node. The proposed model not only detects attacks accurately but also reduces false alarms in the cloud network.
Automatic detection of TV advertisements is of paramount importance for various media monitoring agencies. Existing work in this domain has mostly focused on news channels, using news-specific features. Most commercial products use near-copy detection algorithms instead of generic advertisement classification. A generic detector needs to handle the inter-class and intra-class imbalances present in the data, which arise from the variability of content aired across channels and the frequent repetition of advertisements. Such imbalances bias classifiers towards one of the classes and thus require special treatment. We propose to use a tree of perceptrons to solve this problem. The training data available to each perceptron node is balanced using cluster-based over-sampling and Tomek link cleaning as we traverse the tree downwards. The trained perceptron node then passes the original unbalanced data to its children. This process is repeated recursively until we reach the leaf nodes. We call this new algorithm the "Progressively Balanced Perceptron Tree". We also contribute a TV advertisement dataset consisting of 250 hours of video recorded from five non-news TV channels of different genres. Experiments on this dataset show that the proposed approach achieves superior and more balanced performance than six baseline methods. Our proposal generalizes well across channels and across varying training data sizes, and achieves a top F1-score of 97% in detecting advertisements.
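The Tomek link cleaning step named in the abstract above can be illustrated with a small sketch. It shows only the standard definition of a Tomek link (two samples of different classes that are each other's nearest neighbor) and is not the authors' implementation: the cluster-based over-sampling and the perceptron tree are not reproduced. Removing only the majority-class member of each link is a common convention assumed here, and the names `nearest_neighbor`, `tomek_links` and `clean_majority` are hypothetical.

```python
from math import dist
from typing import List, Sequence, Tuple

Point = Sequence[float]


def nearest_neighbor(i: int, points: List[Point]) -> int:
    """Index of the point closest to points[i], excluding i itself."""
    return min(
        (j for j in range(len(points)) if j != i),
        key=lambda j: dist(points[i], points[j]),
    )


def tomek_links(points: List[Point], labels: List[str]) -> List[Tuple[int, int]]:
    """Pairs (i, j), i < j, that are mutual nearest neighbors with different labels."""
    nn = [nearest_neighbor(i, points) for i in range(len(points))]
    return [
        (i, j)
        for i, j in enumerate(nn)
        if i < j and nn[j] == i and labels[i] != labels[j]
    ]


def clean_majority(points: List[Point], labels: List[str], majority_label: str):
    """Drop the majority-class member of every Tomek link (assumed convention)."""
    drop = set()
    for i, j in tomek_links(points, labels):
        drop.add(i if labels[i] == majority_label else j)
    keep = [k for k in range(len(points)) if k not in drop]
    return [points[k] for k in keep], [labels[k] for k in keep]
```

The brute-force neighbor search is quadratic in the number of samples, which is acceptable for an illustration but would need a spatial index on a dataset of realistic size.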