Biblio
DNS-based domain name resolution is one of the most fundamental Internet services. Meanwhile, DNS cache poisoning attacks have become a critical threat in the cyber world. In addition to Kaminsky attacks, falsified data from compromised authoritative DNS servers has also become a serious threat. Several solutions, such as DNSSEC (DNS Security Extensions), have been proposed in the literature to prevent DNS cache poisoning attacks of the former kind; however, no effective solution has been proposed for the latter. Moreover, owing to performance issues and the significant workload increase on DNS cache servers, DNSSEC has not yet been widely deployed. In this work, we propose an advanced detection method against DNS cache poisoning attacks using machine learning techniques. In the proposed method, in addition to the basic 5-tuple information of a DNS packet, we add a number of features extracted from the standard DNS protocols as well as heuristic aspects, such as “time-related features”, “GeoIP-related features” and the “trigger of cached DNS data”, in order to identify the DNS response packets used for cache poisoning attacks, especially those from compromised authoritative DNS servers. In this paper, as a work in progress, we describe the basic idea and concept of the proposed method as well as the intended network topology of the experimental environment; the prototype implementation, training-data preparation, model creation, and evaluation are left to future work.
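As a rough illustration of the feature idea described above, the sketch below builds a per-response feature vector (5-tuple fields plus time-related, GeoIP-related, and cache-trigger indicators) and feeds it to a generic classifier. The field names, the feature set, and the use of a random forest are assumptions for illustration, not the authors' design.

```python
# Hypothetical sketch: one feature vector per DNS response, then a generic
# classifier. Field names and the feature set are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def extract_features(resp):
    """resp: dict describing one DNS response (hypothetical schema)."""
    return [
        resp["src_port"],                      # part of the 5-tuple
        resp["dst_port"],
        resp["proto"],                         # 17 = UDP, 6 = TCP
        resp["ttl"],                           # answer-record TTL
        resp["rtt_ms"],                        # time-related: query/response delay
        resp["hour_of_day"],                   # time-related: arrival time
        int(resp["geoip_country_changed"]),    # GeoIP-related: country differs from history
        int(resp["cached_entry_overwritten"]), # trigger of cached DNS data
    ]

# toy labelled examples: 0 = benign response, 1 = suspected poisoning attempt
benign = {"src_port": 53, "dst_port": 33921, "proto": 17, "ttl": 3600,
          "rtt_ms": 24.0, "hour_of_day": 14,
          "geoip_country_changed": False, "cached_entry_overwritten": False}
poisoned = {"src_port": 53, "dst_port": 33921, "proto": 17, "ttl": 86400,
            "rtt_ms": 2.0, "hour_of_day": 3,
            "geoip_country_changed": True, "cached_entry_overwritten": True}

X = np.array([extract_features(benign), extract_features(poisoned)])
y = np.array([0, 1])
clf = RandomForestClassifier(n_estimators=100).fit(X, y)
print(clf.predict([extract_features(poisoned)]))
```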
With the tremendous growth of IoT botnet DDoS attacks in recent years, IoT security has become one of the most pressing topics in the field of network security. Many security approaches have been proposed in this area, but they still fall short in dealing with newly emerging variants of IoT malware, known as zero-day attacks. In this paper, we present a honeypot-based approach that uses machine learning techniques for malware detection. The data generated by the IoT honeypot is used as a dataset for the effective and dynamic training of a machine learning model. The approach can serve as a productive starting point for combating zero-day DDoS attacks, which have emerged as an open challenge in defending IoT against DDoS attacks.
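A minimal sketch of the dynamic-training idea, assuming an incremental scikit-learn learner that is updated batch by batch as new labelled honeypot samples arrive; the feature layout, labels, and batch generator are placeholders, not the paper's dataset or pipeline.

```python
# Incremental ("dynamic") training on honeypot-captured traffic: the model is
# updated as each new batch of labelled honeypot samples arrives.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
clf = SGDClassifier()                         # supports partial_fit
classes = np.array([0, 1])                    # 0 = benign, 1 = suspected bot traffic

def next_honeypot_batch(n=64, n_features=10):
    """Stand-in for a batch of feature vectors extracted from honeypot logs."""
    X = rng.normal(size=(n, n_features))
    y = rng.integers(0, 2, size=n)            # labels would come from honeypot analysis
    return X, y

for _ in range(5):                            # e.g. one update per collection cycle
    X, y = next_honeypot_batch()
    clf.partial_fit(X, y, classes=classes)

X_new, _ = next_honeypot_batch(8)
print(clf.predict(X_new))                     # flag suspected zero-day bot traffic
```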
The reconnaissance phase is where attackers identify their targets and collect information from professional social networks, information that can be used to select and exploit targeted employees in order to penetrate an organization. Here, a framework is proposed for the early detection of attackers in the reconnaissance phase, highlighting the characteristic behavior common among attackers in professional social networks, and for the creation of artificial honeypot profiles within the organizational social network that can be used to detect a potential incoming threat. By analyzing a dataset of social network profiles in combination with machine learning techniques, a DspamRPfast model is proposed for building a classifier system that predicts the probability of a profile being fake or malicious and filters such profiles out using XGBoost, achieving faster classification and a greater accuracy of 84.8%.
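An illustrative sketch of an XGBoost profile classifier in the spirit of the description above; the feature columns, threshold, and toy data are assumptions and do not reproduce the DspamRPfast feature set.

```python
# Toy XGBoost classifier that scores profiles and filters out likely fakes.
import numpy as np
from xgboost import XGBClassifier

# columns: [account_age_days, n_connections, profile_completeness, msgs_per_day]
X = np.array([
    [1200, 540, 0.95, 0.4],   # likely genuine
    [   5,  20, 0.30, 9.0],   # likely fake/malicious
    [ 800, 310, 0.85, 0.7],
    [  12,  45, 0.25, 6.5],
])
y = np.array([0, 1, 0, 1])    # 0 = genuine, 1 = fake/malicious

clf = XGBClassifier(n_estimators=200, max_depth=4)
clf.fit(X, y)

# probability of being fake/malicious; profiles above a threshold are filtered out
probs = clf.predict_proba(X)[:, 1]
flagged = probs > 0.5
print(probs, flagged)
```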
The increasing publication of large amounts of theoretically anonymous data can lead to a number of attacks on people's privacy. Publishing sensitive data without exposing the data owners is generally not among software developers' concerns. Data-privacy regulations create an appropriate scenario in which to focus on privacy from the perspective of the data use and exploration that takes place within an organization. The increasing number of sanctions for privacy violations motivates a systematic comparison of three well-known machine learning algorithms in order to measure the usefulness of the data under privacy preservation. The scope of the evaluation is extended by comparing them against a known privacy-preservation metric, using different parameter scenarios and privacy levels. The use of publicly available implementations, the presentation of the methodology, the explanation of the experiments, and the analysis together provide a framework for working on the problem of privacy preservation. Problems are shown in measuring the usefulness of the data and in its relationship to privacy preservation. The findings motivate the need to create metrics optimized for the privacy preferences of the data owners, since the risk of predicting sensitive attributes by means of machine learning techniques is not usually eliminated. In addition, it is shown that full privacy preservation may exist but cannot be measured, while adequate performance of the machine learning models of interest to the data-publishing organization must still be ensured.
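A small sketch of the utility-measurement idea: train the same model on original and generalized copies of a dataset and compare accuracy. The binning used here as "anonymization", the synthetic data, and the random forest are deliberate simplifications, not the paper's method.

```python
# Compare model utility on original vs. generalized (coarsened) data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(1)
age = rng.integers(18, 80, size=500)
income = rng.normal(50_000, 15_000, size=500)
y = (income + 300 * age > 70_000).astype(int)          # synthetic sensitive label

X_orig = np.column_stack([age, income])
X_anon = np.column_stack([age // 10 * 10,               # age generalized to decades
                          np.round(income, -4)])        # income rounded to 10k bands

for name, X in [("original", X_orig), ("generalized", X_anon)]:
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
    acc = accuracy_score(y_te, model.predict(X_te))
    print(f"{name}: accuracy = {acc:.3f}")               # utility drop ~ privacy cost
```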
The OS kernel is the core of the operating system and plays an important role in OS resource management. A popular way to compromise the OS kernel is through a kernel rootkit (i.e., a malicious kernel module). Once a rootkit is loaded into the kernel space, it can carry out arbitrary malicious operations with high privilege. To defeat kernel rootkits, many approaches have been proposed in the past few years. However, existing methods suffer from some limitations: 1) most methods focus on user-mode rootkit detection; 2) some methods have limited ability to detect obfuscated kernel modules; and 3) some methods introduce significant performance overhead. To address these problems, we propose VKRD, a kernel rootkit detection system based on hardware-assisted virtualization technology. Compared with previous methods, VKRD can provide a transparent and efficient execution environment for the target kernel module to reveal its run-time behavior. To select the important run-time features for training our detection models, we utilize the TF-IDF method. By combining hardware-assisted virtualization and machine learning techniques, our kernel rootkit detection solution could potentially be applied in the cloud environment. The experiments show that our system can detect Windows kernel rootkits with high accuracy and moderate performance cost.
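A sketch of how TF-IDF can be applied to run-time behavior: each module's trace is treated as a document of event tokens and weighted before classification. The event names, traces, and use of a linear SVM are invented for illustration and are not VKRD's actual pipeline.

```python
# TF-IDF over run-time event traces, followed by a linear classifier.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

traces = [
    "read_file write_file create_thread",                     # benign module
    "hook_ssdt hide_process write_kernel_memory hook_ssdt",    # rootkit-like module
    "read_file query_registry create_thread",
    "hide_process hook_idt write_kernel_memory",
]
labels = [0, 1, 0, 1]     # 0 = benign, 1 = rootkit

vec = TfidfVectorizer(token_pattern=r"\S+")   # each event name is one token
X = vec.fit_transform(traces)
clf = LinearSVC().fit(X, labels)

print(clf.predict(vec.transform(["hook_ssdt hide_process read_file"])))
```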
Proper evaluation of classifier predictive models requires the selection of appropriate metrics to gauge the effectiveness of a model's performance. The Area Under the Receiver Operating Characteristic Curve (AUC) has become the de facto standard metric for evaluating classifier performance. However, recent studies have suggested that AUC is not necessarily the best metric for all types of datasets, especially those with a high or severe level of class imbalance. There is a need to assess which specific metrics are most beneficial for evaluating model performance on highly imbalanced big data. In this work, we evaluate the performance of eight machine learning techniques on a severely imbalanced big dataset from the cyber security domain. We analyze the behavior of six different metrics to determine which provides the best representation of a model's predictive performance. We also evaluate the impact that adjusting the classification threshold has on our metrics. Our results find that the C4.5N decision tree is the optimal learner when evaluating all presented metrics for severely imbalanced Slow HTTP DoS attack data. Based on our results, we propose that using AUC alone as a primary metric for evaluating models on highly imbalanced big data may be ineffective, and that metrics such as F-measure and geometric mean can offer substantial insight into the true performance of a given model.
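A minimal sketch of the metric comparison on a synthetic, severely imbalanced problem, reporting AUC, F-measure, and geometric mean at two classification thresholds; the data, the learner, and the 1% positive rate are placeholders rather than the paper's setup.

```python
# AUC, F-measure, and G-mean at two thresholds on an imbalanced problem.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, f1_score, confusion_matrix

X, y = make_classification(n_samples=20_000, weights=[0.99, 0.01], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
scores = RandomForestClassifier(random_state=0).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]

def report(threshold):
    pred = (scores >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_te, pred).ravel()
    gmean = np.sqrt(tp / (tp + fn) * tn / (tn + fp))   # sqrt(TPR * TNR)
    print(f"t={threshold:.2f}  AUC={roc_auc_score(y_te, scores):.3f}  "
          f"F1={f1_score(y_te, pred):.3f}  G-mean={gmean:.3f}")

report(0.50)   # default threshold
report(0.10)   # lowered threshold often helps the minority (attack) class
```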
Software-Defined Networking (SDN) provides complete control over data flow in the network. SDN acts as a central point from which data is administered and traffic is managed. Being an open-source product, SDN is more prone to security threats. Security policies must also be enforced, since the controller would otherwise be the most heavily attacked component. Attacks such as DoS and DDoS are the ones most commonly directed at the SDN controller. DDoS is a destructive attack that diverts the normal flow of traffic and floods the system with packets until it halts. Machine learning techniques help identify hidden and unexpected patterns in the network and hence help in analyzing network flows. Both supervised and unsupervised techniques can help detect malicious flows based on parameters such as packet flow, time duration, accuracy, and precision rate. Researchers have used Bayesian networks, wavelets, Support Vector Machines, and KNN to detect DDoS attacks. According to this review, KNN produces better results, with higher precision and a lower false rate for detection. This paper presents a hybrid machine learning approach that improves on the existing KNN on the same dataset, detecting DDoS attacks with greater accuracy and a higher precision rate. Results for traffic with both normal and abnormal behavior are shown, and based on these results the proposed algorithm is designed, which offers a better approach than KNN and will be implemented in future work.
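The abstract does not specify the composition of the hybrid, so the sketch below uses a soft-voting ensemble (KNN, decision tree, logistic regression) purely as a stand-in and compares it with plain KNN on synthetic flow features.

```python
# Plain KNN vs. a stand-in "hybrid" voting ensemble on synthetic flow data.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import VotingClassifier

# synthetic per-flow features (e.g. packet count, duration, bytes/s, SYN ratio)
X, y = make_classification(n_samples=5_000, n_features=8, weights=[0.9, 0.1],
                           random_state=0)          # 1 = DDoS flow

knn = KNeighborsClassifier(n_neighbors=5)
hybrid = VotingClassifier(
    estimators=[("knn", knn),
                ("dt", DecisionTreeClassifier(random_state=0)),
                ("lr", LogisticRegression(max_iter=1000))],
    voting="soft")

for name, model in [("KNN", knn), ("hybrid", hybrid)]:
    prec = cross_val_score(model, X, y, cv=5, scoring="precision").mean()
    print(f"{name}: mean precision = {prec:.3f}")
```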
The widespread use of Wireless Sensor Networks (WSNs) has introduced many security threats due to the nature of such networks, particularly their limited hardware resources and infrastructure-less design. The Denial-of-Service attack is one of the most common types of attack facing such networks. Building an intrusion detection and prevention system to mitigate the effect of Denial-of-Service attacks is not an easy task. This paper proposes the use of two machine learning techniques, namely decision trees and Support Vector Machines, to detect attack signatures on a specialized dataset. The dataset contains regular profiles and several Denial-of-Service attack scenarios in WSNs. The experimental results show that the decision tree technique achieved a better (higher) true positive rate and a better (lower) false positive rate than Support Vector Machines: 99.86% vs. 99.62% and 0.05% vs. 0.09%, respectively.
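A sketch of the decision tree vs. SVM comparison, reporting true positive and false positive rates; synthetic data stands in for the specialized WSN dataset, and the model parameters are assumptions.

```python
# Decision tree vs. SVM, reporting TPR and FPR from the confusion matrix.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix

X, y = make_classification(n_samples=10_000, n_features=12, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for name, model in [("decision tree", DecisionTreeClassifier(random_state=0)),
                    ("SVM", SVC(kernel="rbf"))]:
    pred = model.fit(X_tr, y_tr).predict(X_te)
    tn, fp, fn, tp = confusion_matrix(y_te, pred).ravel()
    print(f"{name}: TPR={tp / (tp + fn):.4f}  FPR={fp / (fp + tn):.4f}")
```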
Intrusion detection is one essential tool for building a secure and trustworthy cloud computing environment, given the ubiquitous presence of cyber attacks that proliferate rapidly and morph dynamically. In our current working paradigm of resource, platform, and service consolidation, cloud computing provides a significant improvement in cost metrics via dynamic provisioning of IT services. Since almost all cloud computing networks rely on providing their services through the Internet, they are prone to a variety of security issues. Therefore, in cloud environments, it is necessary to deploy an Intrusion Detection System (IDS) to detect new and unknown attacks, in addition to signature-based known attacks, with high accuracy. In our deliberation we assume that a system or network “anomalous” event is synonymous with an “intrusion” event when there is a significant departure in one or more underlying system or network activities. A couple of recently proposed ideas aim to develop a hybrid detection mechanism, combining the advantages of signature-based detection schemes with the ability to detect unknown attacks based on anomalies. In this work, we propose a network-based anomaly detection system at the cloud hypervisor level that utilizes a hybrid algorithm, a combination of the K-means clustering algorithm and the SVM classification algorithm, to improve the accuracy of the anomaly detection system. The dataset from the UNSW-NB15 study is used to evaluate the proposed approach, and the results are compared with previous studies. The accuracy of our proposed K-means clustering model is slightly higher than that of others; however, the accuracy we obtained from the SVM model is still low for supervised techniques.
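The abstract does not spell out how K-means and SVM are combined; one plausible arrangement, sketched below, appends each sample's distances to the K-means centroids as extra features before SVM classification. Data and parameters are placeholders rather than the UNSW-NB15 setup.

```python
# Hybrid K-means + SVM: centroid distances appended as features before SVM.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=8_000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

scaler = StandardScaler().fit(X_tr)
X_tr_s, X_te_s = scaler.transform(X_tr), scaler.transform(X_te)

km = KMeans(n_clusters=8, n_init=10, random_state=0).fit(X_tr_s)
X_tr_h = np.hstack([X_tr_s, km.transform(X_tr_s)])   # distances to the 8 centroids
X_te_h = np.hstack([X_te_s, km.transform(X_te_s)])

svm = SVC(kernel="rbf").fit(X_tr_h, y_tr)
print("hybrid accuracy:", accuracy_score(y_te, svm.predict(X_te_h)))
```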
The enormous growth of Internet-based traffic exposes corporate networks to a wide variety of vulnerabilities. Intrusive traffic affects the normal functioning of the network by consuming corporate resources and time. Efficient ways of identifying, protecting against, and mitigating intrusive incidents enhance productivity. Because an Intrusion Detection System (IDS) is hosted in the network and at the user-machine level to oversee malicious traffic in the network and on individual computers, it is one of the critical components of network and host security. Unsupervised anomaly traffic detection techniques are improving over time. This research aims to find an efficient classifier that detects anomalous traffic in the NSL-KDD dataset with a high accuracy level and minimal error rate by experimenting with five machine learning techniques. Five binary classifiers: Stochastic Gradient Descent, Random Forests, Logistic Regression, Support Vector Machine, and a Sequential model are tested and validated to produce the results. The outcome demonstrates that the Random Forest classifier outperforms the other four classifiers both with and without applying the normalization process to the dataset.
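A compressed sketch of the with/without-normalization comparison; synthetic data stands in for NSL-KDD, the Keras Sequential model is omitted, and the remaining four listed classifiers are evaluated with and without scaling via a pipeline.

```python
# Compare classifiers with and without normalization on stand-in data.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.linear_model import SGDClassifier, LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=5_000, n_features=30, random_state=0)

models = {
    "SGD": SGDClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=2000),
    "SVM": SVC(),
}
for name, model in models.items():
    raw = cross_val_score(model, X, y, cv=3).mean()
    norm = cross_val_score(make_pipeline(MinMaxScaler(), model), X, y, cv=3).mean()
    print(f"{name}: raw={raw:.3f}  normalized={norm:.3f}")
```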
At a time when all it takes to open a Twitter account is a mobile phone, authenticating information encountered on social media becomes very complex, especially when we lack measures to verify digital identities in the first place. Because the platform supports anonymity, fake news generated by dubious sources has been observed to travel much faster and farther than real news. Hence, we need valid measures to identify the authors of misinformation to avert these consequences. Researchers have proposed different authorship attribution techniques to approach this kind of problem. However, because tweets are limited to 280 characters, finding a suitable authorship attribution technique is a challenge. This research aims to classify the authors of tweets by comparing machine learning methods such as logistic regression and naive Bayes. The stages of this application are the fetching of tweets, pre-processing, feature extraction, and the development of a machine learning model for classification. This paper illustrates the text-classification process for authorship attribution using machine learning techniques. In total, 46,895 tweets were used as training and testing data, and unique features specific to Twitter were extracted. Several steps were carried out in the pre-processing phase, including the removal of short texts, removal of stop-words and punctuation, as well as tokenizing and stemming of the texts. This approach transforms the pre-processed data into a set of feature vectors in Python. The logistic regression and naive Bayes algorithms were applied to the set of feature vectors for training and testing the classifier. The logistic regression based classifier gave the highest accuracy of 91.1%, compared to the naive Bayes classifier with 89.8%.
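A compressed sketch of the pipeline: stop-word removal and tokenization via the vectorizer, then logistic regression and naive Bayes compared on toy author-labelled tweets. Tweet fetching, stemming, and the Twitter-specific features mentioned above are omitted, and TF-IDF is an assumed vectorization choice.

```python
# Authorship attribution on toy tweets with two compared classifiers.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB

tweets = ["shipping the new build tonight", "another late night at the lab",
          "coffee first, code later", "results look promising, paper soon"]
authors = ["a", "b", "a", "b"]            # toy author labels

for name, clf in [("logistic regression", LogisticRegression(max_iter=1000)),
                  ("naive Bayes", MultinomialNB())]:
    model = make_pipeline(TfidfVectorizer(stop_words="english"), clf)
    model.fit(tweets, authors)
    print(name, model.predict(["late night lab coffee"]))
```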
This is especially true for the Windows operating system (OS) used by government and private organizations. With Windows, the closed-source nature of the operating system has unfortunately meant that hidden security issues are discovered very late and fixes do not arrive in real time. Current static methods of malware detection need to be re-examined. This paper presents an integrated system for automated, real-time monitoring and prediction of rootkit and malware threats for the Windows OS. We propose to host the target Windows machines on the widely used Xen hypervisor and to collect process behavior using virtual machine introspection (VMI). The collected data will be analyzed using state-of-the-art machine learning techniques to quickly isolate malicious process behavior and alert system administrators about potential cyber breaches. This research has two focus areas: identifying memory data structures and developing prediction tools to detect malware. The first part of the research focuses on identifying memory data structures affected by malware, including extracting, with VMI, the kernel data structures that are frequently targeted by rootkits and malware. The second part of the research involves the development of a prediction tool using machine learning techniques.
In this paper, we report our work on using machine learning techniques to predict back-bending activity based on field data acquired in a local nursing home. The data are recorded by a privacy-aware compliance tracking system (PACTS). The objective of PACTS is to detect back-bending activities and issue real-time alerts to the participant when she bends her back excessively, which we hope will help the participant form good habits of using proper body mechanics when performing lifting and pulling tasks. We show that our algorithms can differentiate nursing staff's baseline and high-level bending activities by using human skeleton data without any expert rules.
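A simple sketch of the skeleton-based idea: the trunk's deviation from vertical is computed from shoulder-centre and hip-centre joints, and an alert is raised when it exceeds a threshold. The joint layout and the 45-degree threshold are illustrative assumptions, not PACTS parameters.

```python
# Trunk-bend angle from two skeleton joints, with a simple alert threshold.
import numpy as np

def trunk_angle_deg(shoulder_center, hip_center):
    """Angle between the hip->shoulder vector and the vertical axis (degrees)."""
    trunk = np.asarray(shoulder_center) - np.asarray(hip_center)
    vertical = np.array([0.0, 1.0, 0.0])
    cos_a = trunk @ vertical / np.linalg.norm(trunk)
    return np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))

BEND_THRESHOLD_DEG = 45.0                                   # assumed threshold

upright = trunk_angle_deg([0.0, 1.5, 0.0], [0.0, 1.0, 0.0])  # roughly 0 degrees
bent = trunk_angle_deg([0.45, 1.25, 0.0], [0.0, 1.0, 0.0])   # leaning forward

for label, angle in [("upright", upright), ("bent", bent)]:
    alert = angle > BEND_THRESHOLD_DEG
    print(f"{label}: {angle:.1f} deg  alert={alert}")
```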
Embry-Riddle Aeronautical University (ERAU) is working with the Air Force Research Lab (AFRL) to develop a distributed multi-layer autonomous UAS planning and control technology for gathering intelligence in Anti-Access Area Denial (A2/AD) environments populated by intelligent adaptive adversaries. These resilient autonomous systems are able to navigate through hostile environments while performing Intelligence, Surveillance, and Reconnaissance (ISR) tasks, and minimizing the loss of assets. Our approach incorporates artificial life concepts, with a high-level architecture divided into three biologically inspired layers: cyber-physical, reactive, and deliberative. Each layer has a dynamic level of influence over the behavior of the agent. Algorithms within the layers act on a filtered view of reality, abstracted in the layer immediately below. Each layer takes input from the layer below, provides output to the layer above, and provides direction to the layer below. Fast-reactive control systems in lower layers ensure a stable environment supporting cognitive function on higher layers. The cyber-physical layer represents the central nervous system of the individual, consisting of elements of the vehicle that cannot be changed such as sensors, power plant, and physical configuration. On the reactive layer, the system uses an artificial life paradigm, where each agent interacts with the environment using a set of simple rules regarding wants and needs. Information is communicated explicitly via message passing and implicitly via observation and recognition of behavior. In the deliberative layer, individual agents look outward to the group, deliberating on efficient resource management and cooperation with other agents. Strategies at all layers are developed using machine learning techniques, such as Genetic Algorithms (GA) or neural networks (NN), applied during system training that takes place prior to the mission.
Cloud computing is a revolution in IT technology that provides scalable, virtualized, on-demand resources to end users with greater flexibility, less maintenance, and reduced infrastructure cost. These resources are supervised by different management organizations and provided over the Internet using known networking protocols, standards, and formats. The underlying technologies and legacy protocols contain bugs and vulnerabilities that can open doors to intrusion by attackers. Attacks such as DDoS (Distributed Denial of Service) are among the most frequent; they inflict serious damage and affect cloud performance. In a DDoS attack, the attacker usually uses innocent compromised computers (called zombies), taking advantage of known or unknown bugs and vulnerabilities, to send a large number of packets from these already-captured zombies to a server. This may occupy a major portion of the network bandwidth of the victim cloud infrastructure or consume much of the server's time. Thus, in this work, we designed a DDoS detection system based on the C4.5 algorithm to mitigate the DDoS threat. This algorithm, coupled with signature detection techniques, generates a decision tree to perform automatic, effective detection of signatures of DDoS flooding attacks. To validate our system, we selected other machine learning techniques and compared the obtained results.
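scikit-learn implements CART rather than C4.5, so the sketch below uses an entropy-based decision tree only as an approximation of the C4.5 detector; the flow features and labels are synthetic placeholders rather than the paper's dataset.

```python
# Entropy-based decision tree as a C4.5-like DDoS flow detector.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.metrics import classification_report

# synthetic per-flow features (e.g. pkts/s, bytes/s, SYN ratio, source entropy)
X, y = make_classification(n_samples=10_000, n_features=6, weights=[0.85, 0.15],
                           random_state=0)          # 1 = DDoS flooding flow
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

tree = DecisionTreeClassifier(criterion="entropy", max_depth=6, random_state=0)
tree.fit(X_tr, y_tr)

print(classification_report(y_te, tree.predict(X_te), digits=3))
print(export_text(tree, max_depth=2))   # the learned rules double as signatures
```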