Biblio
Whenever any internet user visits a website, a scripting language runs in the background known as JavaScript. The embedding of malicious activities within the script poses a great threat to the cyberworld. Attackers take advantage of the dynamic nature of the JavaScript and embed malicious code within the website to download malware and damage the host. JavaScript developers obfuscate the script to keep it shielded from getting detected by the malware detectors. In this paper, we propose a novel technique for analysing and detecting JavaScript using sandbox assisted ensemble model. We extract the payload using malware-jail sandbox to get the real script. Upon getting the extracted script, we analyse it to define the features that are needed for creating the dataset. We compute Pearson's r between every feature for feature extraction. An ensemble model consisting of Sequential Minimal Optimization (SMO), Voted Perceptron and AdaBoost algorithm is used with voting technique to detect malicious JavaScript. Experimental results show that our proposed model can detect obfuscated and de-obfuscated malicious JavaScript with an accuracy of 99.6% and 0.03s detection time. Our model performs better than other state-of-the-art models in terms of accuracy and least training and detection time.
Machine learning is a major area in artificial intelligence, which enables computer to learn itself explicitly without programming. As machine learning is widely used in making decision automatically, attackers have strong intention to manipulate the prediction generated my machine learning model. In this paper we study about the different types of attacks and its countermeasures on machine learning model. By research we found that there are many security threats in various algorithms such as K-nearest-neighbors (KNN) classifier, random forest, AdaBoost, support vector machine (SVM), decision tree, we revisit existing security threads and check what are the possible countermeasures during the training and prediction phase of machine learning model. In machine learning model there are 2 types of attacks that is causative attack which occurs during the training phase and exploratory attack which occurs during the prediction phase, we will also discuss about the countermeasures on machine learning model, the countermeasures are data sanitization, algorithm robustness enhancement, and privacy preserving techniques.
Most anti-phishing solutions that exist today require scanning a large portion of the web, which is vast and equivalent to finding a needle in a haystack. In addition, such solutions are not very efficient. We propose a different approach. Our solution does not rely on the scanning of the entire Internet or a large portion of it and only needs access to the brand's traffic in order to be able to detect phishing attempts against that brand. By analyzing a sample of phishing websites, we find features that can be used to distinguish phishing websites from the legitimate ones. We then use these features to train a machine learning classifier capable of helping brands detect phishing attempts against them. Our approach can detect up to 86% of phishing attacks against the brands and is best used as a complementary tool to the existing anti-phishing solutions.
With the rapid development of the information technology, more and more high-speed networks came out. The 4G LTE network as a recently emerging network has gradually entered the mainstream of the communication network. This paper proposed an effective content-based information filtering based on the 4G LTE high-speed network by combing the content-based filter and traditional simple filter. Firstly, raw information is pre-processed by five-tuple filter. Secondly, we determine the topics and character of the source data by key nearest neighbor text classification after minimum-risk Bayesian classification. Finally, the improved AdaBoost algorithm achieves the four-level content-based information filtering. The experiments reveal that the effective information filtering method can be applied to the network security, big data analysis and other fields. It has high research value and market value.
This paper proposes and describes an active authentication model based on user profiles built from user-issued commands when interacting with GUI-based application. Previous behavioral models derived from user issued commands were limited to analyzing the user's interaction with the *Nix (Linux or Unix) command shell program. Human-computer interaction (HCI) research has explored the idea of building users profiles based on their behavioral patterns when interacting with such graphical interfaces. It did so by analyzing the user's keystroke and/or mouse dynamics. However, none had explored the idea of creating profiles by capturing users' usage characteristics when interacting with a specific application beyond how a user strikes the keyboard or moves the mouse across the screen. We obtain and utilize a dataset of user command streams collected from working with Microsoft (MS) Word to serve as a test bed. User profiles are first built using MS Word commands and identification takes place using machine learning algorithms. Best performance in terms of both accuracy and Area under the Curve (AUC) for Receiver Operating Characteristic (ROC) curve is reported using Random Forests (RF) and AdaBoost with random forests.