Data Driven Security Models and Analysis - April 2016
Public Audience
Purpose: To highlight project progress. Information is generally at a higher level which is accessible to the interested public. All information contained in the report (regions 1-3) is a Government Deliverable/CDRL.
PI(s): Ravi Iyer
Co-PI(s): Zbigniew Kalbarczyk and Adam Slagell
Researchers: Phuong Cao and Key-whan Chung
HARD PROBLEM(S) ADDRESSED
This refers to Hard Problems, released November 2012.
- Predictive security metrics - design, development, and validation
- Resilient architectures - in the end we want to use the metrics to achieve a measurable enhancement in system resiliency, i.e., the ability to withstand attacks
- Human behavior - data contain traces of the steps the attacker took, and hence inherently include some aspects of the human behavior (of both users and miscreants)
PUBLICATIONS
Papers published in this quarter as a result of this research. Include title, author(s), venue published/presented, and a short description or abstract. Identify which hard problem(s) the publication addressed. Papers that have not yet been published should be reported in region 2 below.
- P. Cao, E. C. Badger, Z. T. Kalbarczyk, R. K. Iyer, "A Framework for Generation, Replay, and Analysis of Real-World Attack Variants," in Symposium and Bootcamp on the Science of Security (HotSoS), Carnegie Mellon University, April 19-21, 2016.
- H. Lin, H. Alemzadeh, D. Chen, Z. Kalbarczyk, R. K. Iyer, "Safety-critical Cyber-physical Attacks: Analysis, Detection, and Mitigation," in Symposium and Bootcamp on the Science of Security (HotSoS), Carnegie Mellon University, April 19-21, 2016.
ACCOMPLISHMENT HIGHLIGHTS
This quarter we have continued our work on building a security testbed that provides an execution platform for replaying security attacks in a controlled environment. Specifically, we designed and executed experiments to: i) test our methods for generating variants of known attacks and ii) evaluate detection capabilities of several detection techniques against such attack variants.
A framework for generation, replay, and analysis of real-world attack variants. Detection of persistent, multi-stage attacks is challenging. Advanced attackers can create an attack variant that achieves the same objective as a known attack while bypassing existing detection mechanisms. For example, the variant can use a covert channel, such as Internet Relay Chat (IRC) or Domain Name System (DNS) exfiltration, rather than HTTP to download the code needed to compromise the system. We must be able to anticipate such attacks. To study attack variants and facilitate testing of protection mechanisms, we developed (and integrated with our security testbed) a framework that supports: (i) generation of attack variants, (ii) replay of the generated variants in an isolated environment, and (iii) evaluation of the detection capabilities of different techniques. Specific accomplishments include:
- Development of a procedure for generating attack variants that achieve the same objectives as the original attacks. An attack variant is represented by an event sequence (corresponding to attacker actions) in which some events from the original attack's sequence are substituted by equivalent events. A database of interchangeable events was manually constructed based on domain knowledge of the events present on a target system. Given the sequence of events in an attack, events in this sequence are repeatedly replaced with interchangeable events to generate new sequences, which represent attack variants.
- Development of a prototype of an attack replay framework to facilitate replay of attacks and their variants in a controlled environment, i.e., a testbed. Each attack is packaged into an attack container, which contains preinstalled vulnerable software as well as host and network security monitors. An attack is replayed by executing a sequence of programs, such as exploit code or vulnerable software, in the attack containers, which causes the network and host security monitors to generate security events or alerts. When a security monitor observes an event, the event is routed to an attack-detection backend, where different attack detection techniques can be evaluated.
- Evaluation of the framework on three real-world attacks, for which we generated a total of 648 unique attack variants (corresponding either to other known attacks or to new (unknown) attacks that might happen in the future). We evaluated the detection efficiency of the following techniques (in order of increasing sophistication): (1) signature-based detection, using a file hash of known malicious files; (2) anomaly-based detection, using high-frequency events observed in past attacks as an indicator of future attacks; and (3) detection based on probabilistic graphical models (e.g., factor graphs) that capture relationships between multiple events generated by security-monitoring tools to enhance detection efficiency. The results show that factor graph analysis could detect more than half of the attack variants (up to 75%), whereas the signature-based approach detected 25% and the frequency-based approach detected up to 33%. This indicates that simple techniques, such as signature-based detection, cannot detect the majority of the variants, whereas more sophisticated techniques, such as factor graph analysis, are less sensitive to attack variants.
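The variant-generation procedure described in the first accomplishment above can be sketched as follows. This is a minimal illustration, not the project's implementation: the event names, the contents of the INTERCHANGEABLE table, and the generate_variants function are all hypothetical, and exhaustive enumeration of substitutions is only one possible reading of "repeatedly replaced."

```python
from itertools import product

# Hypothetical database of interchangeable events. In the report this table
# was built manually from domain knowledge; these entries are made up.
INTERCHANGEABLE = {
    "download_via_http": ["download_via_irc", "download_via_dns"],
    "compile_exploit": ["download_precompiled_exploit"],
}


def generate_variants(attack, interchangeable):
    """Enumerate sequences obtained by swapping events for equivalent ones.

    Each position keeps the original event or takes one of its listed
    equivalents; the unmodified original sequence is excluded.
    """
    options = [[event] + interchangeable.get(event, []) for event in attack]
    candidates = [list(seq) for seq in product(*options)]
    return [seq for seq in candidates if seq != list(attack)]


# Hypothetical original attack, expressed as a sequence of attacker actions.
original = [
    "login_stolen_credentials",
    "download_via_http",
    "compile_exploit",
    "escalate_privilege",
]

variants = generate_variants(original, INTERCHANGEABLE)
for variant in variants:
    print(" -> ".join(variant))
```

Every generated sequence has the same length as the original and differs from it in at least one event, so each one is a candidate variant with the same objective. The number of variants grows multiplicatively with the number of substitutable events, which is consistent with a few attacks yielding hundreds of variants.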
The proposed framework is being evaluated at the National Center for Supercomputing Applications. The generated variants provide an additional dimension for assessing the efficacy of various detection techniques.
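To illustrate why the two simpler detection techniques react differently to variants, the sketch below implements a file-hash signature check and a frequency-based indicator check. All payloads, event names, and the 0.75 threshold are hypothetical, the functions are not taken from the project, and factor graph detection is not covered here. The behavior shown is illustrative only and does not reproduce the reported detection rates.

```python
import hashlib
from collections import Counter


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a file's contents as a hex string."""
    return hashlib.sha256(data).hexdigest()


def signature_detect(file_bytes: bytes, known_hashes: set) -> bool:
    """Signature-based check: flag a file only if its hash is already known."""
    return sha256_hex(file_bytes) in known_hashes


def frequent_events(past_attacks, min_fraction):
    """Collect events that appear in at least min_fraction of past attacks."""
    counts = Counter(e for attack in past_attacks for e in set(attack))
    threshold = min_fraction * len(past_attacks)
    return {event for event, count in counts.items() if count >= threshold}


def frequency_detect(events, indicators) -> bool:
    """Frequency-based check: flag a sequence containing any indicator event."""
    return any(event in indicators for event in events)


# Hypothetical known-malicious payload and a slightly modified variant of it.
known_hashes = {sha256_hex(b"malicious-payload-v1")}
print(signature_detect(b"malicious-payload-v1", known_hashes))
print(signature_detect(b"malicious-payload-v2", known_hashes))

# Hypothetical event sequences from past attacks.
past_attacks = [
    ["download_via_http", "compile_exploit", "escalate_privilege"],
    ["download_via_http", "escalate_privilege", "erase_logs"],
]
indicators = frequent_events(past_attacks, min_fraction=0.75)

# variant_a keeps one frequent event; variant_b replaces both of them.
variant_a = ["download_via_irc", "compile_exploit", "escalate_privilege"]
variant_b = ["download_via_dns", "compile_exploit", "escalate_via_cron"]
print(frequency_detect(variant_a, indicators))
print(frequency_detect(variant_b, indicators))
```

In this sketch, changing a single byte of the payload defeats the hash check, and a variant that substitutes every frequent event evades the frequency check. A technique that reasons over relationships between several events has more evidence to work with when individual events are swapped.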