Data Driven Security Models and Analysis - April 2015
Public Audience
Purpose: To highlight project progress. Information is generally at a higher level which is accessible to the interested public. All information contained in the report (regions 1-3) is a Government Deliverable/CDRL.
PI(s): Ravi Iyer
Co-PI(s): Zbigniew Kalbarczyk and Adam Slagell
Researcher(s): Cuong Pham, Zachary Estrada, and Phuong Cao
HARD PROBLEM(S) ADDRESSED
This refers to Hard Problems, released November 2012.
- Predictive security metrics - design, development, and validation
- Resilient architectures - in the end we want to use the metrics to achieve a measurable enhancement in system resiliency, i.e., the ability to withstand attacks
- Human behavior - data contain traces of the steps the attacker took, and hence inherently include some aspects of the human behavior (of both users and miscreants)
PUBLICATIONS
Papers published in this quarter as a result of this research. Include title, author(s), venue published/presented, and a short description or abstract. Identify which hard problem(s) the publication addressed. Papers that have not yet been published should be reported in region 2 below.
No publications this quarter.
ACCOMPLISHMENT HIGHLIGHTS
This quarter we focused on building a prototype of a real-time log aggregation and analytics architecture. We explored state-of-the-art monitoring technologies and monitoring tools to incorporate into our framework. One example of such a technology is out-of-VM monitoring using Hypertap, which observes a guest VM's system state from outside the guest, without running an agent inside it, and thereby reduces the attack surface of the monitoring tools. One example of such a tool is Kafka, a multiple-producer/multiple-consumer message bus that provides real-time, reliable log transport from heterogeneous data sources such as syslog, Bro IDS logs, and netflows.
To further enrich our attack detection model, AttackTagger, we built a prototype of an attack repository that contains ready-to-use images of high-profile attacks for education and research purposes. As a first entry, we created an image of the Heartbleed attack, which exploits the heartbeat extension in the OpenSSL library to remotely read a server's memory. The monitoring traces from the attack repository provide important feedback for improving our detection model.
Monitoring architecture. The National Center for Supercomputing Applications (NCSA) provides raw log data, including syslog, netflow, and Bro Intrusion Detection System (IDS) \cite{bro2008homepage} logs, gathered from over 5000 nodes and high-speed networks. The logs also include activity and data centered on Blue Waters and the Extreme Science and Engineering Discovery Environment (XSEDE) at the University of Illinois at Urbana-Champaign, both of which cater to thousands of users.
Raw logs. The raw logs include syslog, netflows, and Bro IDS logs, with a focus on security-related events such as authentication logs and IDS alerts.
Log processing. The aggregated logs need to be pre-processed in order to minimize the amount of information to analyze. The log streams forward their data to the log pre-processor, which: i) picks out only the pieces of data that are meaningful to our system and ii) puts them into a consistent format that is consumable by AttackTagger.
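The two pre-processing steps above (selecting the meaningful fields and emitting a consistent format) can be illustrated with a minimal Python sketch. This is not the project's actual pre-processor: the sshd message pattern, the output field names (source, event, user, src_ip), and the function names are assumptions chosen for illustration, and the format that AttackTagger actually consumes is not specified in this report.

```python
import re
from typing import Iterable, Iterator, Optional

# Hypothetical pattern for sshd authentication lines in syslog.
# A real pre-processor would need one pattern (or parser) per log source.
SSHD_RE = re.compile(
    r"(?P<result>Accepted|Failed) (?P<method>\S+) for (?:invalid user )?"
    r"(?P<user>\S+) from (?P<src_ip>\S+) port \d+"
)


def normalize_syslog(line: str) -> Optional[dict]:
    """Return a normalized event for an sshd login line, or None to drop it."""
    match = SSHD_RE.search(line)
    if match is None:
        return None  # step i): discard lines that carry no security meaning here
    accepted = match.group("result") == "Accepted"
    # step ii): emit a consistent, source-independent record (assumed schema)
    return {
        "source": "syslog",
        "event": "login_success" if accepted else "login_failure",
        "user": match.group("user"),
        "src_ip": match.group("src_ip"),
    }


def preprocess(lines: Iterable[str]) -> Iterator[dict]:
    """Filter and normalize a stream of raw syslog lines."""
    for line in lines:
        event = normalize_syslog(line)
        if event is not None:
            yield event


if __name__ == "__main__":
    sample = [
        "Apr  3 10:15:02 host1 sshd[2211]: Failed password for invalid user "
        "admin from 10.0.0.5 port 51234 ssh2",
        "Apr  3 10:15:09 host1 CRON[900]: session opened for user root",
        "Apr  3 10:16:40 host1 sshd[2250]: Accepted publickey for alice "
        "from 192.168.1.7 port 2200 ssh2",
    ]
    for event in preprocess(sample):
        print(event)
```

In this sketch, lines that match no known pattern are simply dropped, which is what reduces the volume of data passed downstream; other sources such as Bro IDS logs or netflows would need their own normalizers producing the same record shape.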
Log transferring. Log transfer is handled by Kafka, a high-throughput distributed messaging system. Kafka comes with many features that are attractive for our use. First, Kafka is high-throughput and horizontally scalable: it can handle hundreds of megabytes of reads and writes per second from thousands of clients. In addition, Kafka offers strong durability and fault-tolerance guarantees. Because the system was designed to be distributed, new machines can be added without downtime to increase the capacity of the system.
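As a rough illustration of how normalized events might be moved through Kafka, the following Python sketch publishes and consumes JSON-encoded events. It assumes the third-party kafka-python client, a broker reachable at localhost:9092, and a topic named security-events; the topic name, broker address, JSON encoding, and function names are all assumptions made for this example and do not describe the deployed configuration.

```python
import json
from typing import Iterable, Iterator

TOPIC = "security-events"  # hypothetical topic name


def encode_event(event: dict) -> bytes:
    """Serialize an event to UTF-8 JSON with stable key order."""
    return json.dumps(event, sort_keys=True).encode("utf-8")


def decode_event(payload: bytes) -> dict:
    """Inverse of encode_event."""
    return json.loads(payload.decode("utf-8"))


def publish(events: Iterable[dict], bootstrap_servers: str = "localhost:9092") -> None:
    """Send each event to the topic (requires kafka-python and a running broker)."""
    from kafka import KafkaProducer  # third-party client, imported lazily

    producer = KafkaProducer(
        bootstrap_servers=bootstrap_servers,
        value_serializer=encode_event,
    )
    for event in events:
        producer.send(TOPIC, value=event)
    producer.flush()  # block until buffered messages are delivered
    producer.close()


def consume(bootstrap_servers: str = "localhost:9092") -> Iterator[dict]:
    """Yield decoded events from the topic, starting at the earliest offset."""
    from kafka import KafkaConsumer  # third-party client, imported lazily

    consumer = KafkaConsumer(
        TOPIC,
        bootstrap_servers=bootstrap_servers,
        value_deserializer=decode_event,
        auto_offset_reset="earliest",
    )
    for message in consumer:
        yield message.value
```

Several independent producers (for example, one per log source) could call publish against the same topic, and several consumers could read from it, which is the multiple-producer/multiple-consumer pattern described above.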
Security testbed. A security testbed is a controlled system and network environment in which attacks can be replicated. The main requirements are that a security testbed be: i) isolated, to prevent an attack from affecting production infrastructure; ii) instrumented, to collect system and network events at the various stages of an attack; and iii) repeatable and convenient to share among security researchers.
We propose a security testbed architecture based on two virtualization technologies, Virtual Machine Monitors (VMM) at the hardware level and Linux Containers (LXC) at the operating-system level, to emulate multiple isolated systems on a single control host.