A Hypothesis Testing Framework for Network Security - July 2015

Public Audience
Purpose: To highlight project progress. Information is generally at a higher level that is accessible to the interested public. All information contained in the report (regions 1-3) is a Government Deliverable/CDRL.

PI(s): P. Brighten Godfrey

Co-PI(s): Matthew Caesar, David Nicol, William Sanders, and Kevin Jin (Illinois Institute of Technology)

HARD PROBLEM(S) ADDRESSED
This refers to Hard Problems, released November 2012.

This project covers four hard problems:

  • Scalability and composability
  • Policy-governed secure collaboration
  • Predictive security metrics
  • Resilient architectures

PUBLICATIONS
Papers published in this quarter as a result of this research. Include title, author(s), venue published/presented, and a short description or abstract. Identify which hard problem(s) the publication addressed. Papers that have not yet been published should be reported in region 2 below.

[1] Jiaqi Yan and Dong Jin. "VT-Mininet: Virtual-time-enabled Mininet for Scalable and Accurate Software-Defined Network Emulation." ACM SIGCOMM Symposium on SDN Research 2015 (SOSR15), Santa Clara, CA, June 2015.

Source code available at https://github.com/littlepretty/VirtualTimeForMininet

Abstract: The advancement of software-defined networking (SDN) technology is highly dependent on successful transformations from in-house research ideas to real-life products. To enable such transformations, a testbed offering a scalable and high-fidelity networking environment for testing and evaluating new and existing designs is extremely valuable. Mininet, the most popular SDN emulator by far, is designed to achieve both accuracy and scalability by running unmodified code of network applications in lightweight Linux containers. However, Mininet cannot guarantee performance fidelity under high workloads, in particular when the number of concurrent active events exceeds the number of parallel cores. In this project, we develop a lightweight virtual time system in Linux containers and integrate it with Mininet, so that each container has its own virtual clock rather than using the physical system clock, which reflects the serialized execution of multiple containers. With the notion of virtual time, all the containers perceive time as if they run independently and concurrently. As a result, interactions between the containers and the physical system are artificially scaled, making a network appear to be ten times faster from the viewpoint of applications within the containers than it actually is. We also design an adaptive virtual time scheduling subsystem in Mininet, which is responsible for balancing experiment speed and fidelity. Experimental results demonstrate that embedding virtual time into Mininet significantly enhances its performance fidelity, and therefore results in a useful platform for the SDN community to conduct scalable experiments with high fidelity.
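The time-scaling idea in the abstract can be sketched as a simple mapping between real and virtual clocks. This is only a conceptual model for the reader: the actual system intercepts clock-related system calls inside the containers, and the function names below are illustrative, not from the VT-Mininet code.

```python
def to_virtual(real_elapsed, tdf):
    """Map real elapsed seconds to virtual seconds under a time
    dilation factor (TDF) of `tdf`: tdf real seconds are perceived
    as one virtual second inside a container."""
    return real_elapsed / tdf

def to_real(virtual_elapsed, tdf):
    """Inverse mapping: virtual elapsed seconds back to real seconds."""
    return virtual_elapsed * tdf

# Under a TDF of 10, a transfer that took 10 s of serialized wall-clock
# execution is perceived by in-container applications as taking 1 s,
# so the emulated network appears ten times faster than it really ran.
print(to_virtual(10.0, 10))   # 1.0
print(to_real(1.0, 10))       # 10.0
```

This is why dilation trades experiment duration for fidelity: the emulator gets ten real seconds to serialize work that the applications believe took one second.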

[2] Jiaqi Yan and Dong Jin. "A Virtual Time System for Linux-container-based Emulation of Software-defined Networks." ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, London, UK, June 2015 (finalist for the Best Paper Award).

Abstract: Realistic and scalable testing systems are critical for evaluating network applications and protocols to ensure successful real-system deployments. Container-based network emulation is attractive because it combines many desired features of network simulators and physical testbeds. The success of Mininet, a popular software-defined networking (SDN) emulation testbed, demonstrates the value of such an approach: we can execute unmodified binary code on a large-scale emulated network with lightweight OS-level virtualization techniques. However, an ordinary network emulator uses the system clock across all the containers even if a container is not being scheduled to run. This leads to the issue of temporal fidelity, especially under high workloads. Virtual time sheds light on preserving temporal fidelity for large-scale emulation. The key insight is to trade time for system resources by precisely scaling the time of interactions between containers and physical devices by a factor of n, hence making an emulated network appear to be n times faster from the viewpoint of applications in the containers. In this paper, we develop a lightweight Linux-container-based virtual time system and integrate it into Mininet for fidelity and scalability enhancement. We also design an adaptive time dilation scheduling module for balancing speed and accuracy. Experimental results demonstrate that (1) with virtual time, Mininet is able to accurately emulate a network n times larger in scale, where n is the scaling factor, with system behaviors closely matching data obtained from a physical testbed; and (2) with adaptive time dilation scheduling, we reduce the running time by 46% with little accuracy loss. Finally, we present a case study using the virtual-time-enabled Mininet to evaluate the limitations of equal-cost multi-path (ECMP) routing in a data center network.
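The adaptive time dilation scheduling mentioned above can be caricatured as a feedback controller: raise the dilation factor when the emulator is overloaded (to preserve fidelity) and lower it when load is light (to finish the experiment sooner). The load metric, thresholds, and step size below are illustrative assumptions, not the policy from the paper.

```python
def adapt_tdf(tdf, load, high=0.9, low=0.5, step=2, min_tdf=1):
    """One illustrative adaptation step for the time dilation factor.

    tdf  -- current time dilation factor (integer >= min_tdf)
    load -- observed emulator load in [0, 1] (assumed metric)
    Returns the TDF to use for the next scheduling interval.
    """
    if load > high:                          # overloaded: dilate more,
        return tdf * step                    # protecting temporal fidelity
    if load < low and tdf > min_tdf:         # underloaded: dilate less,
        return max(min_tdf, tdf // step)     # speeding up the experiment
    return tdf                               # comfortable band: no change

print(adapt_tdf(2, 0.95))  # 4  (overloaded, dilate more)
print(adapt_tdf(4, 0.30))  # 2  (underloaded, speed up)
print(adapt_tdf(2, 0.70))  # 2  (steady)
```

A controller of this shape explains the reported 46% running-time reduction: whenever fidelity allows, the experiment runs closer to real-time speed.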

[3] Ning Liu, Adnan Haider, Xian-He Sun and Dong Jin. "FatTreeSim: Modeling a Large-scale Fat-Tree Network for HPC Systems and Data Centers Using Parallel and Discrete Event Simulation." ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, London, UK, June 2015 (Best Paper Award).

News coverage available at http://www.iit.edu/news/iittoday/?p=45292

Abstract: Fat-tree topologies have been widely adopted as the communication network in data centers over the past decade. Nowadays, high-performance computing (HPC) system designers are considering using fat-trees as the interconnection network for next-generation supercomputers. For extreme-scale computing systems like data centers and supercomputers, performance is highly dependent on the interconnection network. In this paper, we present FatTreeSim, a PDES-based toolkit consisting of a highly scalable fat-tree network model, with the goal of better understanding the design constraints of fat-tree networking architectures in data centers and HPC systems, as well as evaluating the applications running on top of the network. FatTreeSim is designed to model and simulate large-scale fat-tree networks of up to millions of nodes with protocol-level fidelity. We have conducted extensive experiments to validate and demonstrate the accuracy, scalability, and usability of FatTreeSim. On Argonne Leadership Computing Facility's Blue Gene/Q system, Mira, FatTreeSim is capable of achieving a peak event rate of 305 M/s for a 524,288-node fat-tree model with a total of 567 billion committed events. The strong-scaling experiments use up to 32,768 cores and show near-linear scalability. Compared with a small-scale physical system in Emulab, FatTreeSim can accurately model the latency of the same fat-tree network with an error rate below 10% in most cases. Finally, we demonstrate FatTreeSim's usability through a case study in which FatTreeSim serves as the network module of the YARNsim system, and the error rates for all test cases are less than 13.7%.
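For context on the scale quoted above, the sizes of the standard k-ary fat-trees that FatTreeSim models follow textbook formulas; the sketch below uses those general fat-tree facts and is not taken from the FatTreeSim code. If the 524,288 nodes in the abstract are end hosts, that corresponds to a 128-ary fat-tree.

```python
def fat_tree_capacity(k):
    """Host and switch counts for a standard k-ary fat-tree (k even):
    k pods, each with k/2 edge and k/2 aggregation switches, plus
    (k/2)^2 core switches; each edge switch serves k/2 hosts."""
    assert k % 2 == 0, "k-ary fat-trees require an even k"
    return {
        "hosts": k ** 3 // 4,
        "edge": k * k // 2,
        "aggregation": k * k // 2,
        "core": k * k // 4,
    }

print(fat_tree_capacity(4)["hosts"])     # 16
print(fat_tree_capacity(128)["hosts"])   # 524288
```

The cubic growth of host count in k is what makes million-node models feasible to describe yet expensive to simulate, motivating the PDES approach.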

[4] Ning Liu, Xian-He Sun and Dong Jin. "On Massively Parallel Simulation of Large-Scale Fat-Tree Networks for HPC Systems and Data Centers." ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, London, UK, June 2015 (poster; Best Poster Award).

[5] Anduo Wang, Fan Yang, Mangesh Bendre, Brighten Godfrey, and Matthew Caesar, "Ravel: Orchestrating Software-Defined Networks", software demo at ACM SIGCOMM Symposium on SDN Research (SOSR), June 2015.

ACCOMPLISHMENT HIGHLIGHTS

The increasingly complex, large-scale nature of networks makes it difficult for network architects, security officers, and operators to understand their own networks' behavior. Even very simple behaviors -- such as whether it is possible for any packet (however unusual) to flow between two devices -- are difficult for operators to test, and synthesizing these low-level behaviors into a high-level quantitative understanding of network security has been beyond reach. This project is developing the analysis methodology needed to support scientific reasoning about the security of networks, with a particular focus on information and data flow security. The core of this vision is the Network Hypothesis Testing Methodology (NetHTM), a set of techniques for performing and integrating security analyses applied at different network layers, in different ways, to pose and rigorously answer quantitative hypotheses about the end-to-end security of a network. This project is building both the theoretical underpinnings and a practical realization of the Science of Security.