Biblio
In the last couple of years, the move to cyberspace provides a fertile environment for ransomware criminals like ever before. Notably, since the introduction of WannaCry, numerous ransomware detection solution has been proposed. However, the ransomware incidence report shows that most organizations impacted by ransomware are running state of the art ransomware detection tools. Hence, an alternative solution is an urgent requirement as the existing detection models are not sufficient to spot emerging ransomware treat. With this motivation, our work proposes "DeepGuard," a novel concept of modeling user behavior for ransomware detection. The main idea is to log the file-interaction pattern of typical user activity and pass it through deep generative autoencoder architecture to recreate the input. With sufficient training data, the model can learn how to reconstruct typical user activity (or input) with minimal reconstruction error. Hence, by applying the three-sigma limit rule on the model's output, DeepGuard can distinguish the ransomware activity from the user activity. The experiment result shows that DeepGuard effectively detects a variant class of ransomware with minimal false-positive rates. Overall, modeling the attack detection with user-behavior permits the proposed strategy to have deep visibility of various ransomware families.
In this study, it was aimed to recognize the emotional state from facial images using the deep learning method. In the study, which was approved by the ethics committee, a custom data set was created using videos taken from 20 male and 20 female participants while simulating 7 different facial expressions (happy, sad, surprised, angry, disgusted, scared, and neutral). Firstly, obtained videos were divided into image frames, and then face images were segmented using the Haar library from image frames. The size of the custom data set obtained after the image preprocessing is more than 25 thousand images. The proposed convolutional neural network (CNN) architecture which is mimics of LeNet architecture has been trained with this custom dataset. According to the proposed CNN architecture experiment results, the training loss was found as 0.0115, the training accuracy was found as 99.62%, the validation loss was 0.0109, and the validation accuracy was 99.71%.
As a part of body language, facial expression is a psychological state that reflects the current emotional state of the person. Recognition of facial expressions can help to understand others and enhance communication with others. We propose a facial expression recognition method based on convolutional neural network ensemble learning in this paper. Our model is composed of three sub-networks, and uses the SVM classifier to Integrate the output of the three networks to get the final result. The recognition accuracy of the model's expression on the FER2013 dataset reached 71.27%. The results show that the method has high test accuracy and short prediction time, and can realize real-time, high-performance facial recognition.
Machine learning techniques help to understand underlying patterns in datasets to develop defense mechanisms against cyber attacks. Multilayer Perceptron (MLP) technique is a machine learning technique used in detecting attack vs. benign data. However, it is difficult to construct any effective model when there are imbalances in the dataset that prevent proper classification of attack samples in data. In this research, we use UGR'16 dataset to conduct data wrangling initially. This technique helps to prepare a test set from the original dataset to train the neural network model effectively. We experimented with a series of inputs of varying sizes (i.e. 10000, 50000, 1 million) to observe the performance of the MLP neural network model with distribution of features over accuracy. Later, we use Generative Adversarial Network (GAN) model that produces samples of different attack labels (e.g. blacklist, anomaly spam, ssh scan) for balancing the dataset. These samples are generated based on data from the UGR'16 dataset. Further experiments with MLP neural network model shows that a balanced attack sample dataset, made possible with GAN, produces more accurate results than an imbalanced one.
Modern industrial control systems (ICS) act as victims of cyber attacks more often in last years. These cyber attacks often can not be detected by classical information security methods. Moreover, the consequences of cyber attack's impact can be catastrophic. Since cyber attacks leads to appearance of anomalies in the ICS and technological equipment controlled by it, the task of intrusion detection for ICS can be reformulated as the task of industrial process anomaly detection. This paper considers the applicability of generative adversarial networks (GANs) in the field of industrial processes anomaly detection. Existing approaches for GANs usage in the field of information security (such as anomaly detection in network traffic) were described. It is proposed to use the BiGAN architecture in order to detect anomalies in the industrial processes. The proposed approach has been tested on Secure Water Treatment Dataset (SWaT). The obtained results indicate the prospects of using the examined method in practice.
Phishing attacks are the most common form of attacks that can happen over the internet. This method involves attackers attempting to collect data of a user without his/her consent through emails, URLs, and any other link that leads to a deceptive page where a user is persuaded to commit specific actions that can lead to the successful completion of an attack. These attacks can allow an attacker to collect vital information of the user that can often allow the attacker to impersonate the victim and get things done that only the victim should have been able to do, such as carry out transactions, or message someone else, or simply accessing the victim's data. Many studies have been carried out to discuss possible approaches to prevent such attacks. This research work includes three machine learning algorithms to predict any websites' phishing status. In the experimentation these models are trained using URL based features and attempted to prevent Zero-Day attacks by using proposed software proposal that differentiates the legitimate websites and phishing websites by analyzing the website's URL. From observations, the random forest classifier performed with a precision of 97%, a recall 99%, and F1 Score is 97%. Proposed model is fast and efficient as it only works based on the URL and it does not use other resources for analysis, as was the case for past studies.
It is now possible to synthesize highly realistic images of people who do not exist. Such content has, for example, been implicated in the creation of fraudulent socialmedia profiles responsible for dis-information campaigns. Significant efforts are, therefore, being deployed to detect synthetically-generated content. One popular forensic approach trains a neural network to distinguish real from synthetic content.We show that such forensic classifiers are vulnerable to a range of attacks that reduce the classifier to near- 0% accuracy. We develop five attack case studies on a state- of-the-art classifier that achieves an area under the ROC curve (AUC) of 0.95 on almost all existing image generators, when only trained on one generator. With full access to the classifier, we can flip the lowest bit of each pixel in an image to reduce the classifier's AUC to 0.0005; perturb 1% of the image area to reduce the classifier's AUC to 0.08; or add a single noise pattern in the synthesizer's latent space to reduce the classifier's AUC to 0.17. We also develop a black-box attack that, with no access to the target classifier, reduces the AUC to 0.22. These attacks reveal significant vulnerabilities of certain image-forensic classifiers.
We leverage deep learning algorithms on various user behavioral information gathered from end-user devices to classify a subject of interest. In spite of the ability of these techniques to counter spoofing threats, they are vulnerable to adversarial learning attacks, where an attacker adds adversarial noise to the input samples to fool the classifier into false acceptance. Recently, a handful of mature techniques like Fast Gradient Sign Method (FGSM) have been proposed to aid white-box attacks, where an attacker has a complete knowledge of the machine learning model. On the contrary, we exploit a black-box attack to a behavioral biometric system based on gait patterns, by using FGSM and training a shadow model that mimics the target system. The attacker has limited knowledge on the target model and no knowledge of the real user being authenticated, but induces a false acceptance in authentication. Our goal is to understand the feasibility of a black-box attack and to what extent FGSM on shadow models would contribute to its success. Our results manifest that the performance of FGSM highly depends on the quality of the shadow model, which is in turn impacted by key factors including the number of queries allowed by the target system in order to train the shadow model. Our experimentation results have revealed strong relationships between the shadow model and FGSM performance, as well as the effect of the number of FGSM iterations used to create an attack instance. These insights also shed light on deep-learning algorithms' model shareability that can be exploited to launch a successful attack.
In Machine Learning, White Box Adversarial Attacks rely on knowing underlying knowledge about the model attributes. This works focuses on discovering to distrinct pieces of model information: the underlying architecture and primary training dataset. With the process in this paper, a structured set of input probes and the output of the model become the training data for a deep classifier. Two subdomains in Machine Learning are explored - image based classifiers and text transformers with GPT-2. With image classification, the focus is on exploring commonly deployed architectures and datasets available in popular public libraries. Using a single transformer architecture with multiple levels of parameters, text generation is explored by fine tuning off different datasets. Each dataset explored in image and text are distinguishable from one another. Diversity in text transformer outputs implies further research is needed to successfully classify architecture attribution in text domain.
With the advent of networking technologies and increasing network attacks, Intrusion Detection systems are apparently needed to stop attacks and malicious activities. Various frameworks and techniques have been developed to solve the problem of intrusion detection, still there is need for new frameworks as per the challenging scenario of enormous scale in data size and nature of attacks. Current IDS systems pose challenges on the throughput to work with high speed networks. In this paper we address the issue of high computational overhead of anomaly based IDS and propose the solution using discretization as a data preprocessing step which can drastically reduce the computation overhead. We propose method to provide near real time detection of attacks using only basic flow level features that can easily be extracted from network packets.