Biblio
Conventional methods for anomaly detection include techniques based on clustering, proximity or classification. With the rapid growth of social networks, outliers or anomalies find ingenious ways to obscure themselves in the network, making the conventional techniques inefficient. In this paper, we utilize the ability of Deep Learning over topological characteristics of a social network to detect anomalies in an email network and a Twitter network. We present a model, Graph Neural Network, which is applied on social connection graphs to detect anomalies. Combinations of various social network statistical measures are taken into account to study the graph structure and the functioning of the anomalous nodes by employing deep neural networks on them. The hidden layer of the neural network plays an important role in finding the impact of statistical measure combinations in anomaly detection.
With the remarkable success of deep learning, Deep Neural Networks (DNNs) have been applied as dominant tools to various machine learning domains. Despite this success, however, it has been found that DNNs are surprisingly vulnerable to malicious attacks; adding small, perceptually indistinguishable perturbations to the data can easily degrade classification performance. Adversarial training is an effective defense strategy to train a robust classifier. In this work, we propose to utilize the generator to learn how to create adversarial examples. Unlike the existing approaches that create a one-shot perturbation by a deterministic generator, we propose a recursive and stochastic generator that produces much stronger and more diverse perturbations that comprehensively reveal the vulnerability of the target classifier. Our experimental results on MNIST and CIFAR-10 datasets show that the classifier adversarially trained with our method yields more robust performance over various white-box and black-box attacks.
Adversarial attacks to image classification systems present challenges to convolutional networks and opportunities for understanding them. This study suggests that adversarial perturbations on images lead to noise in the features constructed by these networks. Motivated by this observation, we develop new network architectures that increase adversarial robustness by performing feature denoising. Specifically, our networks contain blocks that denoise the features using non-local means or other filters; the entire networks are trained end-to-end. When combined with adversarial training, our feature denoising networks substantially improve the state-of-the-art in adversarial robustness in both white-box and black-box attack settings. On ImageNet, under 10-iteration PGD white-box attacks where prior art has 27.9% accuracy, our method achieves 55.7%; even under extreme 2000-iteration PGD white-box attacks, our method secures 42.6% accuracy. Our method was ranked first in Competition on Adversarial Attacks and Defenses (CAAD) 2018 — it achieved 50.6% classification accuracy on a secret, ImageNet-like test dataset against 48 unknown attackers, surpassing the runner-up approach by 10%. Code is available at https://github.com/facebookresearch/ImageNet-Adversarial-Training.
Phishing is typically deployed as an attack vector in the initial stages of a hacking endeavour. Due to its low-risk, high-reward nature it has seen widespread adoption, and detecting it has become a challenge in recent times. This paper proposes a novel means of detecting phishing websites using a Generative Adversarial Network. Taking into account the internal structure and external metadata of a website, the proposed approach uses a generator network which generates both legitimate and synthetic phishing features to train a discriminator network. The latter then determines whether the features belong to normal or phishing websites, before improving its detection accuracy based on the classification error. The proposed approach is evaluated using two different phishing datasets and is found to achieve a detection accuracy of up to 94%.
In this work, we applied deep semantic analysis, and machine learning and deep learning techniques, to capture inherent characteristics of email text, and classify emails as phishing or non-phishing.
Phishing is a major problem of the internet era, in which the security of our data on the web is gaining increasing importance. Phishing is one of the most harmful ways to obtain credential information such as usernames, passwords or account numbers from users without their knowledge. Users are often not aware of this type of attack, and may later unwittingly become part of further phishing attacks. The result may be the loss of funds, personal information, brand reputation or brand trust. The detection of phishing sites is therefore necessary. In this paper we design a framework for phishing detection using URLs.
Proper evaluation of classifier predictive models requires the selection of appropriate metrics to gauge the effectiveness of a model's performance. The Area Under the Receiver Operating Characteristic Curve (AUC) has become the de facto standard metric for evaluating this classifier performance. However, recent studies have suggested that AUC is not necessarily the best metric for all types of datasets, especially those in which there exists a high or severe level of class imbalance. There is a need to assess which specific metrics are most beneficial to evaluate the performance of highly imbalanced big data. In this work, we evaluate the performance of eight machine learning techniques on a severely imbalanced big dataset pertaining to the cyber security domain. We analyze the behavior of six different metrics to determine which provides the best representation of a model's predictive performance. We also evaluate the impact that adjusting the classification threshold has on our metrics. Our results find that the C4.5N decision tree is the optimal learner when evaluating all presented metrics for severely imbalanced Slow HTTP DoS attack data. Based on our results, we propose that the use of AUC alone as a primary metric for evaluating highly imbalanced big data may be ineffective, and the evaluation of metrics such as F-measure and Geometric mean can offer substantial insight into the true performance of a given model.
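The point about imbalance-aware metrics in the abstract above can be illustrated with a minimal sketch. It assumes the standard definitions of F-measure (harmonic mean of precision and recall) and Geometric mean (square root of true positive rate times true negative rate). The function name `imbalance_metrics` and the confusion-matrix counts are hypothetical, not taken from the paper. AUC is not computed here; plain accuracy stands in as the metric that looks misleadingly good.

```python
import math


def imbalance_metrics(tp, fp, tn, fn):
    """Return accuracy, F-measure (F1) and geometric mean from confusion counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # true negative rate
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    g_mean = math.sqrt(recall * specificity)
    return {"accuracy": accuracy, "f_measure": f_measure, "g_mean": g_mean}


# Hypothetical data: 10 attack instances among 10,000.
# A degenerate classifier that labels everything "normal":
all_normal = imbalance_metrics(tp=0, fp=0, tn=9990, fn=10)

# A classifier that finds 8 of the 10 attacks with 2 false alarms:
useful = imbalance_metrics(tp=8, fp=2, tn=9988, fn=2)

print(all_normal)
print(useful)
```

For the degenerate classifier, accuracy is 0.999 while F-measure and Geometric mean are both 0.0, which is the kind of gap between metrics that the abstract argues should be examined on severely imbalanced data.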
In parallel with the growth of the Internet and computer networks, the amount of malware has been increasing every day. Today, one of the newest attacks and biggest threats in cybersecurity is ransomware. The effectiveness of applying machine learning techniques for malware detection has been explored in much scientific research; however, there are few studies focused on machine learning-based ransomware detection. In this paper, the effectiveness of ransomware detection using machine learning methods applied to the CICAndMal2017 dataset is examined in two experiments. First, the classifiers are trained on a single dataset containing different types of ransomware. Second, different classifiers are trained separately on datasets of 10 ransomware families. Our findings imply that in both experiments random forest outperforms the other tested classifiers, and that the performance of the classifiers does not change significantly when they are trained on each family separately. Therefore, the random forest classification method is very effective in ransomware detection.
The rapid growth of ransomware attacks has become a serious threat for companies, governments and internet users in recent years. Increases in computing power and memory, together with advances in cryptography, have made ransomware attacks more sophisticated. Therefore, effective methods are required to deal with ransomware. Although many methods have been proposed for ransomware detection, these methods are inefficient at detecting ransomware, and more research is still required in this field. In this paper, we propose a novel method for distinguishing ransomware from benign software using process mining. The proposed method uses process mining to discover the process model from the event logs, then extracts features from this process model and uses these features with classification algorithms to classify ransomware. This paper shows that the use of classification algorithms along with process mining can be suitable for identifying ransomware. The accuracy and performance of our proposed method are evaluated using a study of 21 ransomware families and some benign samples. The results show that the J48 and random forest algorithms have the best accuracy in our method and can achieve 95% accuracy in detecting ransomware.
In recent days, enterprises are expanding their business efficiently through web applications, which has paved the way for building good relationships with their customers. The major threat faced by these enterprises is their inability to provide secure environments, as web applications are prone to severe vulnerabilities. As a result, many security standards and tools have been evolving to handle the vulnerabilities. Though many vulnerability detection tools are available at present, they do not provide sufficient information on the attack. For the long-term functioning of an organization, data along with efficient analytics on the vulnerabilities is required to enhance its reliability. The proposed model thus aims to make use of Machine Learning with Analytics to solve the problem at hand. The sequence of the attack is detected through the pattern using PAA, and the detected vulnerabilities are then classified using a Machine Learning technique such as SVM. Probabilistic results are provided in order to obtain numerical data sets which could be used for obtaining a report on user and application behavior. Using a Dynamic and Reconfigurable PAA with an SVM classifier to analyze the vulnerabilities and their impact in a heterogeneous web environment is a challenging task. This will enhance the earlier processing by analyzing the origin and the pattern of the attack more effectively. Hence, the proposed system is designed to perform detection of attacks. The system works on mitigation and prevention as part of attack prediction.
Today, Software Defined Network (SDN) gives complete control over the data flow in the network. SDN works as a central point from which data is administered and traffic is managed. Being an open source product, SDN is more prone to security threats. Security policies must also be enforced, as otherwise the controller would be the component attacked the most. Attacks such as DDoS and DoS are commonly found against the SDN controller. DDoS is a destructive attack that diverts the normal flow of traffic and floods the system with packets, halting it. Machine Learning techniques help to identify hidden and unexpected patterns in the network and hence help in analyzing the network flow. All the classified and unclassified techniques can help detect the malicious flow based on certain parameters like packet flow, time duration, accuracy and precision rate. Researchers have used Bayesian Network, Wavelets, Support Vector Machine and KNN to detect DDoS attacks. According to the review, KNN produces better results, with higher precision and a lower false rate for detection. This paper presents a hybrid Machine Learning approach that improves on the existing KNN on the same data set, detecting DDoS attacks with greater accuracy and a higher precision rate. The results for traffic with both normal and abnormal behavior are shown, and based on these results the proposed algorithm is designed to give a better approach than KNN; it will be implemented in future work.
Machine learning (ML) classifiers are vulnerable to adversarial examples. An adversarial example is an input sample which is slightly modified to induce misclassification in an ML classifier. In this work, we investigate white-box and grey-box evasion attacks to an ML-based malware detector and conduct performance evaluations in a real-world setting. We compare the defense approaches in mitigating the attacks. We propose a framework for deploying grey-box and black-box attacks to malware detection systems.
With the unprecedented prevalence of mobile network applications, cryptographic protocols, such as the Secure Socket Layer/Transport Layer Security (SSL/TLS), are widely used in mobile network applications for communication security. The proven methods for encrypted video stream classification or encrypted protocol detection are unsuitable for the SSL/TLS traffic. Consequently, application-level traffic classification based networking and security services are facing severe challenges in effectiveness. Existing encrypted traffic classification methods exhibit unsatisfying accuracy for applications with similar state characteristics. In this paper, we propose a multiple-attribute-based encrypted traffic classification system named Multi-Attribute Associated Fingerprints (MAAF). We develop MAAF based on the two key insights that the DNS traces generated during the application runtime contain classification guidance information and that the handshake certificates in the encrypted flows can provide classification clues. Apart from the exploitation of key insights, MAAF employs the context of the encrypted traffic to overcome the attribute-lacking problem during the classification. Our experimental results demonstrate that MAAF achieves 98.69% accuracy on the real-world traceset that consists of 16 applications, supports the early prediction, and is robust to the scale of the training traceset. Besides, MAAF is superior to the state-of-the-art methods in terms of both accuracy and robustness.
We propose a new spam detection approach based solely on metadata features gained from email headers. The approach achieves above 99% classification accuracy on the CSDMC2010 dataset, which matches or surpasses state-of-the-art spam classifiers. We utilize a static set of engineered features, supplemented with automatically extracted features. The approach is just as effective for spam detection in end-to-end encryption, as our feature set remains unchanged for encrypted emails. In contrast to most established spam detectors, we disregard the email body completely and can therefore deliver very high classification speeds, as computationally expensive text preprocessing is not necessary.
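As a rough illustration of header-only feature extraction, the sketch below uses Python's standard `email` package to derive a handful of metadata features without ever reading the message body. The specific features (hop count, recipient count and so on) and the function name `header_features` are assumptions chosen for illustration; the abstract does not list the paper's actual feature set.

```python
from email import message_from_string
from email.utils import getaddresses


def header_features(raw_email):
    """Extract a few illustrative metadata features from headers only."""
    msg = message_from_string(raw_email)
    # Collect every address listed in the To and Cc headers.
    recipients = getaddresses(msg.get_all("To", []) + msg.get_all("Cc", []))
    return {
        "num_received_hops": len(msg.get_all("Received", [])),
        "num_recipients": len(recipients),
        "has_reply_to": msg.get("Reply-To") is not None,
        "subject_length": len(msg.get("Subject", "")),
        "has_message_id": msg.get("Message-ID") is not None,
        "num_headers": len(msg.keys()),
    }


# Hypothetical message; the body is present but never inspected.
raw = (
    "Received: from a.example by b.example; Mon, 1 Jan 2024 00:00:00 +0000\n"
    "Received: from c.example by a.example; Mon, 1 Jan 2024 00:00:00 +0000\n"
    "From: Alice <alice@example.com>\n"
    "To: bob@example.com, carol@example.com\n"
    "Subject: Hello\n"
    "Message-ID: <1@example.com>\n"
    "\n"
    "Body text that is never inspected.\n"
)
features = header_features(raw)
print(features)
```

Because only headers are parsed, a feature vector like this would be unaffected by whether the body is encrypted, which is the property the abstract highlights. Feeding such vectors to a classifier is left out of the sketch.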
Current authentication systems based on passwords and PIN codes are not enough to guard against attacks from malicious users. For this reason, in recent years, several studies have been proposed with the aim of identifying users based on their typing dynamics. In this paper, we propose a deep neural network architecture aimed at discriminating between different users using a set of keystroke features. The idea behind the proposed method is to identify the users silently and continuously during their typing on a monitored system. To perform such user identification effectively, we propose a feature model able to capture the typing style that is specific to each given user. The proposed approach is evaluated on a large dataset derived by integrating two real-world datasets from existing studies. The merged dataset contains a total of 1530 different users, each writing a set of different typing samples. Several deep neural networks, with an increasing number of hidden layers and two different sets of features, are tested with the aim of finding the best configuration. The final best classifier scores a precision equal to 0.997, a recall equal to 0.99 and an accuracy equal to 99% using an MLP deep neural network with 9 hidden layers. Finally, the performance obtained by using the deep learning approach is also compared with the performance of a traditional decision-tree machine learning algorithm, attesting to the effectiveness of deep learning-based classifiers in the domain of keystroke analysis.
An intrusion detection system is used to examine malicious activity that occurs in a network or a system. An intrusion detection system is software or a device that scans a system or a network for suspicious activity. Due to the growing connectivity between computers, intrusion detection has become vital to network security. Various machine learning techniques and statistical methodologies have been used to build different types of Intrusion Detection Systems to protect networks. The performance of an intrusion detection system depends mainly on accuracy. Accuracy for intrusion detection must be enhanced to reduce false alarms and to increase the detection rate. In order to improve performance, different techniques have been used in recent works. Analyzing huge volumes of network traffic data is the main work of an intrusion detection system, and a well-organized classification methodology is required to handle it. The proposed approach addresses this issue. Machine learning techniques such as Support Vector Machine (SVM) and Naïve Bayes, which are well known for solving classification problems, are applied. For evaluation of the intrusion detection system, the NSL-KDD knowledge discovery dataset is used. For the comparative analysis, the accuracy and misclassification rate of both classifiers are calculated. The outcomes show that SVM works better than Naïve Bayes.
The dendritic cell algorithm (DCA) is an immune-inspired classification algorithm developed for the purpose of anomaly detection in computer networks. The DCA uses a weighted function in its context detection phase to map three categories of input signals (safe, danger and pathogen-associated molecular pattern) to three output context values termed co-stimulatory, mature and semi-mature, which are then used to perform classification. The weighted function used by the DCA requires either manually pre-defined weights, usually provided by immunologists, or weights derived empirically from the training dataset. Neither of these is sufficiently flexible to work with different datasets and produce an optimum classification result. To address this limitation, this work proposes an approach for computing the three output context values of the DCA by employing the recently proposed TSK+ fuzzy inference system, such that the weights are always optimal for the provided data set with regard to a specific application. The proposed approach was validated and evaluated by applying it to two popular datasets, KDD99 and UNSW NB15. The results from the experiments demonstrate that the proposed approach outperforms the conventional DCA in terms of classification accuracy.
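The conventional weighted function described in the abstract above can be sketched as a plain weighted sum from the three input signals to the three output context values. This is a simplified illustration of the baseline DCA step only, not of the TSK+ replacement the paper proposes. The weight matrix `HYPOTHETICAL_WEIGHTS`, the signal values and the final mature-versus-semi-mature comparison are assumptions made for the example, not values from the paper.

```python
def dca_context(signals, weights):
    """Map input signals to output context values with a weighted sum."""
    return {
        output: sum(w[name] * signals[name] for name in signals)
        for output, w in weights.items()
    }


# Hypothetical, manually chosen weights (one row per output context value).
HYPOTHETICAL_WEIGHTS = {
    "co_stimulatory": {"pamp": 2.0, "danger": 1.0, "safe": 2.0},
    "semi_mature": {"pamp": 0.0, "danger": 0.0, "safe": 3.0},
    "mature": {"pamp": 2.0, "danger": 1.0, "safe": -3.0},
}

# Hypothetical normalized signal readings for one data item.
signals = {"pamp": 1.0, "danger": 0.5, "safe": 0.2}
context = dca_context(signals, HYPOTHETICAL_WEIGHTS)

# Simplified decision rule (an assumption): a larger mature value than
# semi-mature value suggests an anomalous context.
label = "anomalous" if context["mature"] > context["semi_mature"] else "normal"
print(context, label)
```

The fixed weight table is exactly the part the abstract identifies as inflexible: it has to be chosen by hand or fitted per dataset, which motivates computing the output values with a fuzzy inference system instead.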