Biblio
We leverage deep learning algorithms on various user behavioral information gathered from end-user devices to classify a subject of interest. In spite of the ability of these techniques to counter spoofing threats, they are vulnerable to adversarial learning attacks, where an attacker adds adversarial noise to the input samples to fool the classifier into false acceptance. Recently, a handful of mature techniques like Fast Gradient Sign Method (FGSM) have been proposed to aid white-box attacks, where an attacker has a complete knowledge of the machine learning model. On the contrary, we exploit a black-box attack to a behavioral biometric system based on gait patterns, by using FGSM and training a shadow model that mimics the target system. The attacker has limited knowledge on the target model and no knowledge of the real user being authenticated, but induces a false acceptance in authentication. Our goal is to understand the feasibility of a black-box attack and to what extent FGSM on shadow models would contribute to its success. Our results manifest that the performance of FGSM highly depends on the quality of the shadow model, which is in turn impacted by key factors including the number of queries allowed by the target system in order to train the shadow model. Our experimentation results have revealed strong relationships between the shadow model and FGSM performance, as well as the effect of the number of FGSM iterations used to create an attack instance. These insights also shed light on deep-learning algorithms' model shareability that can be exploited to launch a successful attack.
Machine learning is a major area in artificial intelligence, which enables computer to learn itself explicitly without programming. As machine learning is widely used in making decision automatically, attackers have strong intention to manipulate the prediction generated my machine learning model. In this paper we study about the different types of attacks and its countermeasures on machine learning model. By research we found that there are many security threats in various algorithms such as K-nearest-neighbors (KNN) classifier, random forest, AdaBoost, support vector machine (SVM), decision tree, we revisit existing security threads and check what are the possible countermeasures during the training and prediction phase of machine learning model. In machine learning model there are 2 types of attacks that is causative attack which occurs during the training phase and exploratory attack which occurs during the prediction phase, we will also discuss about the countermeasures on machine learning model, the countermeasures are data sanitization, algorithm robustness enhancement, and privacy preserving techniques.
With the tremendous growth of IoT botnet DDoS attacks in recent years, IoT security has now become one of the most concerned topics in the field of network security. A lot of security approaches have been proposed in the area, but they still lack in terms of dealing with newer emerging variants of IoT malware, known as Zero-Day Attacks. In this paper, we present a honeypot-based approach which uses machine learning techniques for malware detection. The IoT honeypot generated data is used as a dataset for the effective and dynamic training of a machine learning model. The approach can be taken as a productive outset towards combatting Zero-Day DDoS Attacks which now has emerged as an open challenge in defending IoT against DDoS Attacks.
Modern Energy Management Systems (EMS) are becoming increasingly complex in order to address the urgent issue of global energy consumption. These systems retrieve vital information from various Internet-connected resources in a smart grid to function effectively. However, relying on such resources results in them being susceptible to cyber attacks. Malicious actors can exploit the interconnections between the resources to perform nefarious tasks such as modifying critical firmware, sending bogus sensor data, or stealing sensitive information. To address this issue, we propose a novel framework that integrates PowerWatch, a solution that detects compromised devices in the smart grid with Cyber-secure Power Router (CSPR), a smart energy management system. The goal is to ascertain whether or not such a device has operated maliciously. To achieve this, PowerWatch utilizes a machine learning model that analyzes information from system and library call lists extracted from CSPR in order to detect malicious activity in the EMS. To test the efficacy of our framework, a number of unique attack scenarios were performed on a realistic testbed that comprises functional versions of CSPR and PowerWatch to monitor the electrical environment for suspicious activity. Our performance evaluation investigates the effectiveness of this first-of-its-kind merger and provides insight into the feasibility of developing future cybersecure EMS. The results of our experimental procedures yielded 100% accuracy for each of the attack scenarios. Finally, our implementation demonstrates that the integration of PowerWatch and CSPR is effective and yields minimal overhead to the EMS.
At a time when all it takes to open a Twitter account is a mobile phone, the act of authenticating information encountered on social media becomes very complex, especially when we lack measures to verify digital identities in the first place. Because the platform supports anonymity, fake news generated by dubious sources have been observed to travel much faster and farther than real news. Hence, we need valid measures to identify authors of misinformation to avert these consequences. Researchers propose different authorship attribution techniques to approach this kind of problem. However, because tweets are made up of only 280 characters, finding a suitable authorship attribution technique is a challenge. This research aims to classify authors of tweets by comparing machine learning methods like logistic regression and naive Bayes. The processes of this application are fetching of tweets, pre-processing, feature extraction, and developing a machine learning model for classification. This paper illustrates the text classification for authorship process using machine learning techniques. In total, there were 46,895 tweets used as both training and testing data, and unique features specific to Twitter were extracted. Several steps were done in the pre-processing phase, including removal of short texts, removal of stop-words and punctuations, tokenizing and stemming of texts as well. This approach transforms the pre-processed data into a set of feature vector in Python. Logistic regression and naive Bayes algorithms were applied to the set of feature vectors for the training and testing of the classifier. The logistic regression based classifier gave the highest accuracy of 91.1% compared to the naive Bayes classifier with 89.8%.
With an increase in targeted attacks such as advanced persistent threats (APTs), enterprise system defenders require comprehensive frameworks that allow them to collaborate and evaluate their defense systems against such attacks. MITRE has developed a framework which includes a database of different kill-chains, tactics, techniques, and procedures that attackers employ to perform these attacks. In this work, we leverage natural language processing techniques to extract attacker actions from threat report documents generated by different organizations and automatically classify them into standardized tactics and techniques, while providing relevant mitigation advisories for each attack. A naïve method to achieve this is by training a machine learning model to predict labels that associate the reports with relevant categories. In practice, however, sufficient labeled data for model training is not always readily available, so that training and test data come from different sources, resulting in bias. A naïve model would typically underperform in such a situation. We address this major challenge by incorporating an importance weighting scheme called bias correction that efficiently utilizes available labeled data, given threat reports, whose categories are to be automatically predicted. We empirically evaluated our approach on 18,257 real-world threat reports generated between year 2000 and 2018 from various computer security organizations to demonstrate its superiority by comparing its performance with an existing approach.
Malicious emails pose substantial threats to businesses. Whether it is a malware attachment or a URL leading to malware, exploitation or phishing, attackers have been employing emails as an effective way to gain a foothold inside organizations of all kinds. To combat email threats, especially targeted attacks, traditional signature- and rule-based email filtering as well as advanced sandboxing technology both have their own weaknesses. In this paper, we propose a predictive analysis approach that learns the differences between legit and malicious emails through static analysis, creates a machine learning model and makes detection and prediction on unseen emails effectively and efficiently. By comparing three different machine learning algorithms, our preliminary evaluation reveals that a Random Forests model performs the best.