Biblio
Many aspects of our daily lives now rely on computers, including communications, transportation, government, finance, medicine, and education. However, with increased dependence comes increased vulnerability. Therefore recognizing attacks quickly is critical. In this paper, we introduce a new anomaly detection algorithm based on persistent homology, a tool which computes summary statistics of a manifold. The idea is to represent a cyber network with a dynamic point cloud and compare the statistics over time. The robustness of persistent homology makes for a very strong comparison invariant.
Given a stream of heterogeneous graphs containing different types of nodes and edges, how can we spot anomalous ones in real-time while consuming bounded memory? This problem is motivated by and generalizes from its application in security to host-level advanced persistent threat (APT) detection. We propose StreamSpot, a clustering based anomaly detection approach that addresses challenges in two key fronts: (1) heterogeneity, and (2) streaming nature. We introduce a new similarity function for heterogeneous graphs that compares two graphs based on their relative frequency of local substructures, represented as short strings. This function lends itself to a vector representation of a graph, which is (a) fast to compute, and (b) amenable to a sketched version with bounded size that preserves similarity. StreamSpot exhibits desirable properties that a streaming application requires: it is (i) fully-streaming; processing the stream one edge at a time as it arrives, (ii) memory-efficient; requiring constant space for the sketches and the clustering, (iii) fast; taking constant time to update the graph sketches and the cluster summaries that can process over 100,000 edges per second, and (iv) online; scoring and flagging anomalies in real time. Experiments on datasets containing simulated system-call flow graphs from normal browser activity and various attack scenarios (ground truth) show that StreamSpot is high-performance; achieving above 95% detection accuracy with small delay, as well as competitive time and memory usage.
This paper presents an unsupervised method for systematically identifying anomalies in music datasets. The model integrates categorical regression and robust estimation techniques to infer anomalous scores in music clips. When applied to a music genre recognition dataset, the new method is able to detect corrupted, distorted, or mislabeled audio samples based on commonly used features in music information retrieval. The evaluation results show that the algorithm outperforms other anomaly detection methods and is capable of finding problematic samples identified by human experts. The proposed method introduces a preliminary framework for anomaly detection in music data that can serve as a useful tool to improve data integrity in the future.
In settings where data instances are generated sequentially or in streaming fashion, online learning algorithms can learn predictors using incremental training algorithms such as stochastic gradient descent. In some security applications such as training anomaly detectors, the data streams may consist of private information or transactions and the output of the learning algorithms may reveal information about the training data. Differential privacy is a framework for quantifying the privacy risk in such settings. This paper proposes two differentially private strategies to mitigate privacy risk when training a classifier for anomaly detection in an online setting. The first is to use a randomized active learning heuristic to screen out uninformative data points in the stream. The second is to use mini-batching to improve classifier performance. Experimental results show how these two strategies can trade off privacy, label complexity, and generalization performance.
The term “Advanced Persistent Threat” refers to a well-organized, malicious group of people who launch stealthy attacks against computer systems of specific targets, such as governments, companies or military. The attacks themselves are long-lasting, difficult to expose and often use very advanced hacking techniques. Since they are advanced in nature, prolonged and persistent, the organizations behind them have to possess a high level of knowledge, advanced tools and competent personnel to execute them. The attacks are usually preformed in several phases - reconnaissance, preparation, execution, gaining access, information gathering and connection maintenance. In each of the phases attacks can be detected with different probabilities. There are several ways to increase the level of security of an organization in order to counter these incidents. First and foremost, it is necessary to educate users and system administrators on different attack vectors and provide them with knowledge and protection so that the attacks are unsuccessful. Second, implement strict security policies. That includes access control and restrictions (to information or network), protecting information by encrypting it and installing latest security upgrades. Finally, it is possible to use software IDS tools to detect such anomalies (e.g. Snort, OSSEC, Sguil).
Most network traffic analysis applications are designed to discover malicious activity by only relying on high-level flow-based message properties. However, to detect security breaches that are specifically designed to target one network (e.g., Advanced Persistent Threats), deep packet inspection and anomaly detection are indispensible. In this paper, we focus on how we can support experts in discovering whether anomalies at message level imply a security risk at network level. In SNAPS (Semantic Network traffic Analysis through Projection and Selection), we provide a bottom-up pixel-oriented approach for network traffic analysis where the expert starts with low-level anomalies and iteratively gains insight in higher level events through the creation of multiple selections of interest in parallel. The tight integration between visualization and machine learning enables the expert to iteratively refine anomaly scores, making the approach suitable for both post-traffic analysis and online monitoring tasks. To illustrate the effectiveness of this approach, we present example explorations on two real-world data sets for the detection and understanding of potential Advanced Persistent Threats in progress.
What you see is not definitely believable is not a rare case in the cyber security monitoring. However, due to various tricks of camouflages, such as packing or virutal private network (VPN), detecting "advanced persistent threat"(APT) by only signature based malware detection system becomes more and more intractable. On the other hand, by carefully modeling users' subsequent behaviors of daily routines, probability for one account to generate certain operations can be estimated and used in anomaly detection. To the best of our knowledge so far, a novel behavioral analytic framework, which is dedicated to analyze Active Directory domain service logs and to monitor potential inside threat, is now first proposed in this project. Experiments on real dataset not only show that the proposed idea indeed explores a new feasible direction for cyber security monitoring, but also gives a guideline on how to deploy this framework to various environments.
An intrusion detection system (IDS) inspects all inbound and outbound network activity and identifies suspicious patterns that may indicate a network or system attack from someone attempting to break into or compromise a system. A networkbased system, or NIDS, the individual packets flowing through a network are analyzed. In a host-based system, the IDS examines at the activity on each individual computer or host. IDS techniques are divided into two categories including misuse detection and anomaly detection. In recently years, Mobile Agent based technology has been used for distributed systems with having characteristic of mobility and autonomy. In this working we aimed to combine IDS with Mobile Agent concept for more scale, effective, knowledgeable system.
Host-based anomaly intrusion detection system design is very challenging due to the notoriously high false alarm rate. This paper introduces a new host-based anomaly intrusion detection methodology using discontiguous system call patterns, in an attempt to increase detection rates whilst reducing false alarm rates. The key concept is to apply a semantic structure to kernel level system calls in order to reflect intrinsic activities hidden in high-level programming languages, which can help understand program anomaly behaviour. Excellent results were demonstrated using a variety of decision engines, evaluating the KDD98 and UNM data sets, and a new, modern data set. The ADFA Linux data set was created as part of this research using a modern operating system and contemporary hacking methods, and is now publicly available. Furthermore, the new semantic method possesses an inherent resilience to mimicry attacks, and demonstrated a high level of portability between different operating system versions.
Host-based anomaly intrusion detection system design is very challenging due to the notoriously high false alarm rate. This paper introduces a new host-based anomaly intrusion detection methodology using discontiguous system call patterns, in an attempt to increase detection rates whilst reducing false alarm rates. The key concept is to apply a semantic structure to kernel level system calls in order to reflect intrinsic activities hidden in high-level programming languages, which can help understand program anomaly behaviour. Excellent results were demonstrated using a variety of decision engines, evaluating the KDD98 and UNM data sets, and a new, modern data set. The ADFA Linux data set was created as part of this research using a modern operating system and contemporary hacking methods, and is now publicly available. Furthermore, the new semantic method possesses an inherent resilience to mimicry attacks, and demonstrated a high level of portability between different operating system versions.
Distributed Denial of Service (DDoS) attacks are one of the most important threads in network systems. Due to the distributed nature, DDoS attacks are very hard to detect, while they also have the destructive potential of classical denial of service attacks. In this study, a novel 2-step system is proposed for the detection of DDoS attacks. In the first step an anomaly detection is performed on the destination IP traffic. If an anomaly is detected on the network, the system proceeds into the second step where a decision on every user is made due to the behaviour models. Hence, it is possible to detect attacks in the network that diverges from users' behavior model.
Reduction of Quality (RoQ) attack is a stealthy denial of service attack. It can decrease or inhibit normal TCP flows in network. Victims are hard to perceive it as the final network throughput is decreasing instead of increasing during the attack. Therefore, the attack is strongly hidden and it is difficult to be detected by existing detection systems. Based on the principle of Time-Frequency analysis, we propose a two-stage detection algorithm which combines anomaly detection with misuse detection. In the first stage, we try to detect the potential anomaly by analyzing network traffic through Wavelet multiresolution analysis method. According to different time-domain characteristics, we locate the abrupt change points. In the second stage, we further analyze the local traffic around the abrupt change point. We extract the potential attack characteristics by autocorrelation analysis. By the two-stage detection, we can ultimately confirm whether the network is affected by the attack. Results of simulations and real network experiments demonstrate that our algorithm can detect RoQ attacks, with high accuracy and high efficiency.
This paper proposes a new network-based cyber intrusion detection system (NIDS) using multicast messages in substation automation systems (SASs). The proposed network-based intrusion detection system monitors anomalies and malicious activities of multicast messages based on IEC 61850, e.g., Generic Object Oriented Substation Event (GOOSE) and Sampled Value (SV). NIDS detects anomalies and intrusions that violate predefined security rules using a specification-based algorithm. The performance test has been conducted for different cyber intrusion scenarios (e.g., packet modification, replay and denial-of-service attacks) using a cyber security testbed. The IEEE 39-bus system model has been used for testing of the proposed intrusion detection method for simultaneous cyber attacks. The false negative ratio (FNR) is the number of misclassified abnormal packets divided by the total number of abnormal packets. The results demonstrate that the proposed NIDS achieves a low fault negative rate.
In this paper, an algorithm is proposed to automatically produce hierarchical graph-based representations of maritime shipping lanes extrapolated from historical vessel positioning data. Each shipping lane is generated based on the detection of the vessel behavioural changes and represented in a compact synthetic route composed of the network nodes and route segments. The outcome of the knowledge discovery process is a geographical maritime network that can be used in Maritime Situational Awareness (MSA) applications such as track reconstruction from missing information, situation/destination prediction, and detection of anomalous behaviour. Experimental results are presented, testing the algorithm in a specific scenario of interest, the Dover Strait.
Cyber intrusions to substations of a power grid are a source of vulnerability since most substations are unmanned and with limited protection of the physical security. In the worst case, simultaneous intrusions into multiple substations can lead to severe cascading events, causing catastrophic power outages. In this paper, an integrated Anomaly Detection System (ADS) is proposed which contains host- and network-based anomaly detection systems for the substations, and simultaneous anomaly detection for multiple substations. Potential scenarios of simultaneous intrusions into the substations have been simulated using a substation automation testbed. The host-based anomaly detection considers temporal anomalies in the substation facilities, e.g., user-interfaces, Intelligent Electronic Devices (IEDs) and circuit breakers. The malicious behaviors of substation automation based on multicast messages, e.g., Generic Object Oriented Substation Event (GOOSE) and Sampled Measured Value (SMV), are incorporated in the proposed network-based anomaly detection. The proposed simultaneous intrusion detection method is able to identify the same type of attacks at multiple substations and their locations. The result is a new integrated tool for detection and mitigation of cyber intrusions at a single substation or multiple substations of a power grid.
Cyber intrusions to substations of a power grid are a source of vulnerability since most substations are unmanned and with limited protection of the physical security. In the worst case, simultaneous intrusions into multiple substations can lead to severe cascading events, causing catastrophic power outages. In this paper, an integrated Anomaly Detection System (ADS) is proposed which contains host- and network-based anomaly detection systems for the substations, and simultaneous anomaly detection for multiple substations. Potential scenarios of simultaneous intrusions into the substations have been simulated using a substation automation testbed. The host-based anomaly detection considers temporal anomalies in the substation facilities, e.g., user-interfaces, Intelligent Electronic Devices (IEDs) and circuit breakers. The malicious behaviors of substation automation based on multicast messages, e.g., Generic Object Oriented Substation Event (GOOSE) and Sampled Measured Value (SMV), are incorporated in the proposed network-based anomaly detection. The proposed simultaneous intrusion detection method is able to identify the same type of attacks at multiple substations and their locations. The result is a new integrated tool for detection and mitigation of cyber intrusions at a single substation or multiple substations of a power grid.
Cyber systems play a critical role in improving the efficiency and reliability of power system operation and ensuring the system remains within safe operating margins. An adversary can inflict severe damage to the underlying physical system by compromising the control and monitoring applications facilitated by the cyber layer. Protection of critical assets from electronic threats has traditionally been done through conventional cyber security measures that involve host-based and network-based security technologies. However, it has been recognized that highly skilled attacks can bypass these security mechanisms to disrupt the smooth operation of control systems. There is a growing need for cyber-attack-resilient control techniques that look beyond traditional cyber defense mechanisms to detect highly skilled attacks. In this paper, we make the following contributions. We first demonstrate the impact of data integrity attacks on Automatic Generation Control (AGC) on power system frequency and electricity market operation. We propose a general framework to the application of attack resilient control to power systems as a composition of smart attack detection and mitigation. Finally, we develop a model-based anomaly detection and attack mitigation algorithm for AGC. We evaluate the detection capability of the proposed anomaly detection algorithm through simulation studies. Our results show that the algorithm is capable of detecting scaling and ramp attacks with low false positive and negative rates. The proposed model-based mitigation algorithm is also efficient in maintaining system frequency within acceptable limits during the attack period.
The popularity of mobile devices and the enormous number of third party mobile applications in the market have naturally lead to several vulnerabilities being identified and abused. This is coupled with the immaturity of intrusion detection system (IDS) technology targeting mobile devices. In this paper we propose a modular host-based IDS framework for mobile devices that uses behavior analysis to profile applications on the Android platform. Anomaly detection can then be used to categorize malicious behavior and alert users. The proposed system accommodates different detection algorithms, and is being tested at a major telecom operator in North America. This paper highlights the architecture, findings, and lessons learned.
Security issues are crucial in a number of machine learning applications, especially in scenarios dealing with human activity rather than natural phenomena (e.g., information ranking, spam detection, malware detection, etc.). In such cases, learning algorithms may have to cope with manipulated data aimed at hampering decision making. Although some previous work addressed the issue of handling malicious data in the context of supervised learning, very little is known about the behavior of anomaly detection methods in such scenarios. In this contribution, we analyze the performance of a particular method–online centroid anomaly detection–in the presence of adversarial noise. Our analysis addresses the following security-related issues: formalization of learning and attack processes, derivation of an optimal attack, and analysis of attack efficiency and limitations. We derive bounds on the effectiveness of a poisoning attack against centroid anomaly detection under different conditions: attacker's full or limited control over the traffic and bounded false positive rate. Our bounds show that whereas a poisoning attack can be effectively staged in the unconstrained case, it can be made arbitrarily difficult (a strict upper bound on the attacker's gain) if external constraints are properly used. Our experimental evaluation, carried out on real traces of HTTP and exploit traffic, confirms the tightness of our theoretical bounds and the practicality of our protection mechanisms.
In network intrusion detection research, one popular strategy for finding attacks is monitoring a network's activity for anomalies: deviations from profiles of normality previously learned from benign traffic, typically identified using tools borrowed from the machine learning community. However, despite extensive academic research one finds a striking gap in terms of actual deployments of such systems: compared with other intrusion detection approaches, machine learning is rarely employed in operational "real world" settings. We examine the differences between the network intrusion detection problem and other areas where machine learning regularly finds much more success. Our main claim is that the task of finding attacks is fundamentally different from these other applications, making it significantly harder for the intrusion detection community to employ machine learning effectively. We support this claim by identifying challenges particular to network intrusion detection, and provide a set of guidelines meant to strengthen future research on anomaly detection.