Data Driven Security Models and Analysis - October 2016

Public Audience
Purpose: To highlight project progress. Information is generally at a higher level which is accessible to the interested public. All information contained in the report (regions 1-3) is a Government Deliverable/CDRL.

PI(s): Ravi Iyer

Co-PI(s): Zbigniew Kalbarczyk and Adam Slagell

Researchers: Phuong Cao and Key-whan Chung

HARD PROBLEM(S) ADDRESSED
This refers to Hard Problems, released November 2012.

  • Predictive security metrics - design, development, and validation
  • Resilient architectures - in the end we want to use the metrics to achieve a measurable enhancement in system resiliency, i.e., the ability to withstand attacks
  • Human behavior - data contain traces of the steps the attacker took, and hence inherently include some aspects of the human behavior (of both users and miscreants)

PUBLICATIONS
Papers published in this quarter as a result of this research. Include title, author(s), venue published/presented, and a short description or abstract. Identify which hard problem(s) the publication addressed. Papers that have not yet been published should be reported in region 2 below.

[1] Cuong Pham, Zachary Estrada, Phuong Cao, Zbigniew Kalbarczyk, and Ravishankar Iyer, "Building Reliable and Secure Virtual Machines using Architectural Invariants", IEEE Security and Privacy Magazine, volume 12, issue 5, September - October 2014.

Abstract: Reliability and security tend to be treated separately because they appear orthogonal: reliability focuses on accidental failures, security on intentional attacks. Because of the apparent dissimilarity between the two, tools to detect and recover from different classes of failures and attacks are usually designed and implemented differently. So, integrating support for reliability and security in a single framework is a significant challenge. Here, we discuss how to address this challenge in the context of cloud computing, for which reliability and security are growing concerns. Because cloud deployments usually consist of commodity hardware and software, efficient monitoring is key to achieving resiliency. Although reliability and security monitoring might use different types of analytics, the same sensing infrastructure can provide inputs to monitoring modules.

[2] Cuong Pham, Zachary Estrada, Zbigniew Kalbarczyk, and Ravishankar Iyer, "Reliability and Security Monitoring of Virtual Machines using Hardware Architectural Invariants", 44th International Conference on Dependable Systems and Networks, June 2014. William C. Carter Award for Best Paper based on PhD work and Best Paper Award voted by conference participants.

Abstract: This paper presents a solution that simultaneously addresses both reliability and security (RnS) in a monitoring framework. We identify the commonalities between reliability and security to guide the design of HyperTap, a hypervisor-level framework that efficiently supports both types of monitoring in virtualization environments. In HyperTap, the logging of system events and states is common across monitors and constitutes the core of the framework. The audit phase of each monitor is implemented and operated independently. In addition, HyperTap relies on hardware invariants to provide a strongly isolated root of trust. HyperTap uses active monitoring, which can be adapted to enforce a wide spectrum of RnS policies. We validate HyperTap by introducing three example monitors: Guest OS Hang Detection (GOSHD), Hidden RootKit Detection (HRKD), and Privilege Escalation Detection (PED). Our experiments with fault injection and real rootkits/exploits demonstrate that HyperTap provides robust monitoring with low performance overhead.

[3] G. Wang, Zachary Estrada, Cuong Pham, Zbigniew Kalbarczyk, and Ravishankar Iyer, "Hypervisor Introspection: Exploiting Timing Side-Channels against VM Monitoring", 44th International Conference on Dependable Systems and Networks, June 2014.

Abstract: Hypervisor activity is designed to be hidden from guest Virtual Machines (VMs) as well as external observers. In this paper, we demonstrate that this does not always occur. We present a method by which an external observer can learn sensitive information about hypervisor internals, such as VM scheduling or hypervisor-level monitoring schemes, by observing a VM. We refer to this capability as Hypervisor Introspection (HI).

[4] Phuong Cao, Eric Badger, Zbigniew Kalbarczyk, Ravishankar Iyer, Alexander Withers and Adam Slagell, "Towards an Unified Security Testbed and Security Analytics Framework", Symposium and Bootcamp on the Science of Security (HotSoS 2015), Urbana, IL, April 20-21, 2015.

Abstract: This paper presents the architecture of an end-to-end security testbed and security analytics framework, which aims to: i) understand real-world exploitation of known security vulnerabilities and ii) preemptively detect multi-stage attacks, i.e., before the system misuse. With the increasing number of security vulnerabilities, it is necessary for security researchers and practitioners to understand: i) system and network behaviors under attacks and ii) potential effects of attacks to the target infrastructure. To safely emulate and instrument exploits of known vulnerabilities, we use virtualization techniques to isolate attacks in containers, e.g., Linux-based containers or Virtual Machines, and to deploy monitors, e.g., kernel probes or network packet captures, across a system and network stack. To infer the evolution of attack stages from monitoring data, we use a probabilistic graphical model, namely AttackTagger, which represents learned knowledge of simulated attacks in our security testbed and real-world attacks. Experiments are being run on a real-world deployment of the framework at the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign.

[5] Phuong Cao, Eric Badger, Zbigniew Kalbarczyk, Ravishankar Iyer and Adam Slagell, "Preemptive Intrusion Detection: Theoretical Framework and Real-World Measurements", Symposium and Bootcamp on the Science of Security (HotSoS 2015), Urbana, IL, April 20-21, 2015.

Abstract: This paper presents a Factor Graph based framework called AttackTagger for highly accurate and preemptive detection of attacks, i.e., before the system misuse. We use security logs on real incidents that occurred over a six-year period at the National Center for Supercomputing Applications (NCSA) to evaluate AttackTagger. Our data consist of security incidents that led to compromise of the target system, i.e., the attacks in the incidents were only identified after the fact by security analysts. AttackTagger detected 74 percent of attacks, and the majority of them were detected before the system misuse. Finally, AttackTagger uncovered six hidden attacks that were not detected by intrusion detection systems during the incidents or by security analysts in post-incident forensic analysis.

[6] Z.J. Estrada, C. Pham, Z. Kalbarczyk, R.K. Iyer, and L. Yan, "Dynamic VM Dependability Monitoring Using Hypervisor Probes", 11th European Dependable Computing Conference-Dependability in Practice (EDCC 2015), Paris, France, September 7-11, 2015.

Abstract: Many current VM monitoring approaches require guest OS modifications and are also unable to perform application level monitoring, reducing their value in a cloud setting. This paper introduces hprobes, a framework that allows one to dynamically monitor applications and operating systems inside a VM. The hprobe framework does not require any changes to the guest OS, which avoids the tight coupling of monitoring with its target. Furthermore, the monitors can be customized and enabled/disabled while the VM is running. To demonstrate the usefulness of this framework, we present three sample detectors: an emergency detector for a security vulnerability, an application watchdog, and an infinite-loop detector. We test our detectors on real applications and demonstrate that those detectors achieve an acceptable level of performance overhead with a high degree of flexibility.

[7] Key-whan Chung, Charles A. Kamhoua, Kevin A. Kwiat, Zbigniew T. Kalbarczyk and Ravishankar K. Iyer, "Game Theory with Learning for Cyber Security Monitoring", IEEE High Assurance Systems Engineering Symposium (HASE 2016), Orlando, FL, January 7-9, 2016.

Abstract: Recent attacks show that threats to cyber infrastructure are not only increasing in volume, but are getting more sophisticated. The attacks may comprise multiple actions that are hard to differentiate from benign activity, and therefore common detection techniques have to deal with high false positive rates. Because of the imperfect performance of automated detection techniques, responses to such attacks are highly dependent on human-driven decision-making processes. While game theory has been applied to many problems that require rational decision-making, we find limitations in applying such methods to security games. In this work, we propose Q-Learning to react automatically to the adversarial behavior of a suspicious user to secure the system. This work compares variations of Q-Learning with a traditional stochastic game. Simulation results show the feasibility of Naive Q-Learning despite restricted information about the opponent.

[8] Phuong Cao, Eric Badger, Zbigniew Kalbarczyk, and Ravishankar Iyer, "A Framework for Generation, Replay and Analysis of Real-World Attack Variants", Symposium and Bootcamp on the Science of Security (HotSoS 2016), Pittsburgh, PA, April 20-21, 2016.

Abstract: This paper presents a framework for (1) generating variants of known attacks, (2) replaying attack variants in an isolated environment and, (3) validating detection capabilities of attack detection techniques against the variants. Our framework facilitates reproducible security experiments. We generated 648 variants of three real-world attacks (observed at the National Center for Supercomputing Applications at the University of Illinois). Our experiment showed the value of generating attack variants by quantifying the detection capabilities of three detection methods: a signature-based detection technique, an anomaly-based detection technique, and a probabilistic graphical model-based technique.

[9] Hui Lin, Homa Alemzadeh, Daniel Chen, Zbigniew Kalbarczyk, and Ravishankar K. Iyer, "Safety-critical Cyber-physical Attacks: Analysis, Detection, and Mitigation", Symposium and Bootcamp on the Science of Security (HotSoS 2016), Pittsburgh, PA, April 20-21, 2016.

Abstract: Today's cyber-physical systems (CPSs) can have very different characteristics in terms of control algorithms, configurations, underlying infrastructure, communication protocols, and real-time requirements. Despite these variations, they all face the threat of malicious attacks that exploit the vulnerabilities in the cyber domain as footholds to introduce safety violations in the physical processes. In this paper, we focus on a class of attacks that impact the physical processes without introducing anomalies in the cyber domain. We present the common challenges in detecting this type of attack in the contexts of two very different CPSs (i.e., power grids and surgical robots). In addition, we present a general principle for detecting such cyber-physical attacks, which combines the knowledge of both cyber and physical domains to estimate the adverse consequences of malicious activities in a timely manner.

ACCOMPLISHMENT HIGHLIGHTS

Analysis of new credential stealing and infrastructure abuse attacks targeting Blue Waters, a petascale supercomputer hosted at NCSA. Data on these attacks were used to update our factor graph model. Specifically, we collected a new set of security-related events and used them to create new factor functions in the factor graph model employed by AttackTagger, so as to cover these recent attacks.

Evaluation of the detection capabilities of random factor functions (functions that return a random value when invoked during factor graph evaluation) and comparison with factor functions defined from system knowledge and the insights of security experts (tested in an earlier stage of this work). The goal was to assess how effective human-defined factor functions are. We found that random factor functions consistently perform worse than factor functions defined using system and domain expertise. We used 648 unique attack variants generated by our framework to conduct the comparison. The results in Table 1 show that AttackTagger's effectiveness drops severely when random factor functions are used, e.g., from 75% to 3.5% for Attack 1; an illustrative sketch of the two kinds of factor functions is given after Table 1.

                                               Attack 1         Attack 2         Attack 3
AttackTagger: human-defined factor functions   108/144 (75%)    108/216 (50%)    186/288 (65%)
AttackTagger: random factor functions          5/144 (3.5%)     72/216 (33%)     110/288 (38%)
Total number of variants                       144              216              288

Table 1: Attack variant detection results
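
For illustration, the Python sketch below contrasts a human-defined bivariate factor function with a random one. The event and state names are hypothetical and are not the actual factor functions used in the experiments; the sketch only shows why a random potential carries no information about the attack.

import random

def human_defined_factor(event, state):
    # Encodes expert knowledge: certain (event, state) pairings receive a
    # higher potential because experts consider them strong attack indicators.
    # Event and state names here are hypothetical examples.
    if event == "sensitive_file_read" and state == "malicious":
        return 2.0
    if event == "login_from_new_ip" and state == "suspicious":
        return 1.5
    return 0.5   # weak/neutral support for all other pairings

def random_factor(event, state):
    # Baseline used in the comparison: returns a random potential and
    # therefore carries no information about the attack.
    return random.uniform(0.0, 2.0)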

Proposed methods to automatically learn factor graph functions from past data (in addition to the manually defined factor functions reported earlier).

To explore applications of Factor Graphs in other domains, we are developing methods to automatically learn factor functions from past data, providing high-quality factor functions in addition to the ones defined manually. The factor functions encode the domain knowledge of security experts in the intrusion detection model, i.e., the factor graph. However, defining factor functions is a manual process performed by a practitioner, which is time-consuming and does not scale to large systems.

The proposed technique works as follows. First, security data of users involved in past incidents are collected in an incident database. Then attack patterns are extracted from individual incidents and from groups of incidents. An attack pattern is a sequence of important security alerts or events related to a user. Two methods are used to extract patterns: i) n-gram analysis, which extracts consecutive events within individual incidents, and ii) longest common sub-sequence analysis, which extracts events common to a group of incidents (see the sketch below). To assist humans in modeling security incidents, factor functions are automatically generated to describe the extracted attack patterns. In our work, we consider two primitive types of factor functions: bivariate and multivariate factor functions.
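
The following Python sketch illustrates the two extraction methods, under the assumption that each incident is an ordered list of event names; the event names are hypothetical.

def ngrams(events, n=2):
    # Consecutive event n-grams within a single incident.
    return [tuple(events[i:i + n]) for i in range(len(events) - n + 1)]

def longest_common_subsequence(a, b):
    # Longest common sub-sequence of events across two incidents
    # (classic dynamic-programming formulation).
    dp = [[[] for _ in range(len(b) + 1)] for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if x == y:
                dp[i + 1][j + 1] = dp[i][j] + [x]
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j], key=len)
    return dp[-1][-1]

incident_1 = ["login_from_new_ip", "download_exploit", "privilege_escalation"]
incident_2 = ["login_from_new_ip", "port_scan", "privilege_escalation"]

print(ngrams(incident_1, n=2))                             # consecutive event pairs
print(longest_common_subsequence(incident_1, incident_2))  # events shared by both incidents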

The automation process uses a template library of bivariate and multivariate factor functions. In this library, all possible factor functions are generated and stored by enumerating the possible input parameters. Since we have a finite set of events and user states, the number of bivariate factor functions is finite, on the order of thousands. The main procedure works as follows. For each extracted attack pattern, the events in the pattern and their corresponding labels are matched against the inputs of the bivariate factor functions in the library. The matched bivariate factor functions are then extracted from the library and presented to a human. Each extracted factor function is assigned a weight, and the functions are presented to security experts in decreasing order of weight. The weight of a function is the number of times the function was matched against the incident database; factor functions that appear frequently in attacks thus receive higher weights. A sketch of this matching-and-weighting step follows.
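
The following Python sketch illustrates the matching-and-weighting step under the same assumptions as above: hypothetical event and state names, with a small template library enumerated over them (the real library is enumerated the same way over the system's full event and state sets).

from collections import Counter
from itertools import product

# Hypothetical event and state vocabularies.
EVENTS = ["login_from_new_ip", "download_exploit", "privilege_escalation"]
STATES = ["benign", "suspicious", "malicious"]

# Template library: one bivariate factor template per (event, state) pair.
template_library = set(product(EVENTS, STATES))

def match_templates(attack_patterns):
    # Count how often each (event, state) template is matched by the labeled
    # attack patterns extracted from the incident database.
    weights = Counter()
    for pattern in attack_patterns:             # pattern: list of (event, state) pairs
        for event, state in pattern:
            if (event, state) in template_library:
                weights[(event, state)] += 1
    return weights.most_common()                # ranked by weight for expert review

patterns = [
    [("login_from_new_ip", "suspicious"), ("privilege_escalation", "malicious")],
    [("login_from_new_ip", "suspicious"), ("download_exploit", "malicious")],
]
for (event, state), weight in match_templates(patterns):
    print(f"factor({event}, {state}) weight={weight}")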

However, this procedure is not without pitfalls. We observed factor functions with abnormally high weights in attacks that involve repetitive events, such as denial-of-service or brute-force attacks. Thus, the factor functions are ranked by weight and presented to security experts for final judgment and selection. Further work will focus on improving the proposed procedure.