Abstract—Network intrusion detection systems (NIDS) are essential security building blocks for today's organizations to ensure safe and trusted communication of information. In this paper, we study the feasibility of off-line, deep-learning-based NIDSes by constructing the detection engine with multiple advanced deep learning models and conducting a quantitative and comparative evaluation of those models. We first introduce the general deep learning methodology and its potential implications for the network intrusion detection problem. We then review multiple machine learning solutions to two network intrusion detection tasks (the NSL-KDD and UNSW-NB15 datasets). We develop a TensorFlow-based deep learning library, called NetLearner, and implement a handful of cutting-edge deep learning models for NIDS. Finally, we conduct a quantitative and comparative performance evaluation of those models using NetLearner.
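For a rough illustration of the kind of model such a detection engine builds on (a sketch, not NetLearner's actual code), the following trains a binary attack-vs-normal classifier on preprocessed NSL-KDD-style feature vectors with TensorFlow/Keras; the load_nsl_kdd loader is a hypothetical stand-in for the dataset preprocessing.

    import tensorflow as tf

    def build_nids_mlp(num_features):
        # binary attack-vs-normal classifier over preprocessed flow features
        model = tf.keras.Sequential([
            tf.keras.Input(shape=(num_features,)),
            tf.keras.layers.Dense(256, activation="relu"),
            tf.keras.layers.Dropout(0.5),   # regularization against overfitting
            tf.keras.layers.Dense(256, activation="relu"),
            tf.keras.layers.Dense(1, activation="sigmoid"),
        ])
        model.compile(optimizer="adam", loss="binary_crossentropy",
                      metrics=["accuracy"])
        return model

    # X_train, y_train = load_nsl_kdd()   # hypothetical preprocessing/loader
    # model = build_nids_mlp(X_train.shape[1])
    # model.fit(X_train, y_train, epochs=10, batch_size=128)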
Software-defined networking (SDN) enables efficient network management. As the technology matures, utilities are looking to integrate those benefits into their operations technology (OT) networks. To help the community to better understand and evaluate the effects of such integration, we develop DSSnet, a testing platform that combines a power distribution system simulator and an SDN-based network emulator for smart grid planning and evaluation. DSSnet relies on a container-based virtual time system to achieve efficient synchronization between the simulation and emulation systems. To enhance the system scalability and usability, we extend DSSnet to support a distributed controller environment. To enhance system fidelity, we extend the virtual time system to support kernel-based switches. We also evaluate the system performance of DSSnet and demonstrate the usability of DSSnet with a resilient demand response application case study.
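The synchronization scheme described above might look roughly like the following sketch, where the emulator runs in virtual time and is paused at each epoch boundary while the power-flow simulation catches up. The power_sim and net_emu interfaces and the exchange_events hook are hypothetical stand-ins, not DSSnet's API.

    def cosimulate(power_sim, net_emu, sync_interval, end_time):
        # power_sim, net_emu, and exchange_events are hypothetical interfaces
        now = 0.0
        while now < end_time:
            net_emu.resume()                    # containers run in virtual time
            net_emu.run_for(sync_interval)      # advance emulation by one epoch
            net_emu.pause()                     # freeze all container clocks
            power_sim.advance_to(now + sync_interval)   # run power-flow solver
            exchange_events(power_sim, net_emu)  # measurements, control actions
            now += sync_interval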
Due to the evolution of programming languages, interpreted languages have gained widespread use in scientific and research computing. Interpreted languages excel at being portable, easy to use, and fast in prototyping than their ahead-of-time (AOT) counterparts, including C, C++, and Fortran. While traditionally considered as slow to execute, advancements in Just-in-Time (JIT) compilation techniques have significantly improved the execution speed of interpreted languages and in some cases outperformed AOT languages. In this paper, we explore some challenges and design strategies in developing a high performance parallel discrete event simulation engine, called Simian, written with interpreted languages with JIT capabilities, including Python, Lua, and Javascript. Our results show that Simian with JIT performs similarly to AOT simulators, such as MiniSSF and ROSS. We expect that with features like good performance, userfriendliness, and portability, the just-in-time parallel simulation will become a common choice for modeling and simulation in the near future.
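At its core, a discrete-event engine like Simian's revolves around a timestamp-ordered event loop. The minimal sequential sketch below (illustrative only, not Simian code) shows that core, which JIT runtimes such as PyPy can accelerate; Simian layers entities, services, and MPI-based parallelism on top of such a loop.

    import heapq

    class Engine:
        def __init__(self):
            self.now = 0.0
            self._queue = []   # entries: (timestamp, seq, handler, data)
            self._seq = 0      # tie-breaker keeps heap ordering deterministic

        def schedule(self, delay, handler, data=None):
            heapq.heappush(self._queue,
                           (self.now + delay, self._seq, handler, data))
            self._seq += 1

        def run(self, end_time):
            # pop events in timestamp order until the horizon is reached
            while self._queue and self._queue[0][0] <= end_time:
                self.now, _, handler, data = heapq.heappop(self._queue)
                handler(self, data)

    eng = Engine()
    eng.schedule(1.0, lambda e, d: print("event at", e.now))
    eng.run(10.0)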
Many of the emerging wide-area monitoring, protection, and control (WAMPAC) applications in modern electrical grids rely heavily on the availability and integrity of widespread phasor measurement unit (PMU) data. Therefore, it is critical to protect PMU networks against growing cyber-attacks and system faults. In this paper, we present a self-healing PMU network design that considers both power system observability and communication network characteristics. Our design utilizes centralized network control, such as the emerging software-defined networking (SDN) technology, to design resilient network self-healing algorithms against cyber-attacks. Upon detection of a cyber-attack, the PMU network can reconfigure itself to isolate compromised devices and re-route measurement data with the goal of preserving power system observability. We have developed a proof-of-concept system in a container-based network testbed using integer linear programming to solve a graph-based PMU system model. We also evaluate the system performance regarding the self-healing plan generation and installation using the IEEE 30-bus system.
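For a flavor of the graph-based ILP mentioned above, the sketch below encodes the classic PMU observability constraint (a bus is observed if it or a neighbor hosts a usable PMU) with the PuLP library. The 5-bus adjacency and the set of compromised buses are made up for illustration and do not reproduce the paper's model.

    import pulp

    # made-up 5-bus adjacency; bus 4's PMU was isolated after an attack
    adj = {1: [2, 3], 2: [1, 3, 4], 3: [1, 2, 5], 4: [2, 5], 5: [3, 4]}
    compromised = {4}

    prob = pulp.LpProblem("pmu_placement", pulp.LpMinimize)
    x = {b: pulp.LpVariable(f"pmu_{b}", cat="Binary") for b in adj}

    prob += pulp.lpSum(x.values())               # minimize the number of PMUs
    for b, nbrs in adj.items():
        # bus b is observable if it or some neighbor hosts a usable PMU
        prob += pulp.lpSum(x[n] for n in [b] + nbrs) >= 1
    for b in compromised:
        prob += x[b] == 0                        # isolated PMUs cannot be used

    prob.solve()
    print({b: int(x[b].value()) for b in adj})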
As parallel and distributed systems evolve toward extreme scale (high-performance computing systems involve millions of cores and billion-way parallelism, and high-capacity storage systems require efficient access to petabytes or exabytes of data), many new challenges arise in designing and deploying next-generation interconnection networks for these systems. Fat-tree networks have been widely used in both data centers and high-performance computing (HPC) systems in the past decades and are promising candidates for the next-generation extreme-scale networks. In this article, we present FatTreeSim, a simulation framework that supports modeling and simulation of extreme-scale fat-tree networks with the goal of understanding the design constraints of next-generation HPC and distributed systems and aiding the design and performance optimization of the applications running on these systems. We have systematically experimented with FatTreeSim on Emulab and Blue Gene/Q and analyzed the scalability and fidelity of FatTreeSim with various network configurations. On the Blue Gene/Q Mira, FatTreeSim can achieve a peak performance of 305 million events per second using 16,384 cores. Finally, we have applied FatTreeSim to simulate several large-scale Hadoop YARN applications to demonstrate its usability.
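For readers unfamiliar with the topology family FatTreeSim models, the following sketch enumerates the switches and links of a standard k-ary fat-tree (k pods, (k/2)^2 core switches, and k^3/4 hosts); it is illustrative only and not FatTreeSim code.

    def fat_tree(k):
        # k pods, (k/2)^2 core switches, k^3/4 hosts; k must be even
        assert k % 2 == 0
        core = [f"core_{i}" for i in range((k // 2) ** 2)]
        links = []
        for p in range(k):
            aggs = [f"agg_{p}_{i}" for i in range(k // 2)]
            edges = [f"edge_{p}_{i}" for i in range(k // 2)]
            links += [(a, e) for a in aggs for e in edges]   # intra-pod mesh
            for i, a in enumerate(aggs):                     # uplinks to core
                links += [(a, core[i * (k // 2) + j]) for j in range(k // 2)]
            for i, e in enumerate(edges):                    # host downlinks
                links += [(e, f"host_{p}_{i}_{h}") for h in range(k // 2)]
        return core, links

    core, links = fat_tree(4)
    print(len(links))   # 48 links: 16 hosts, 20 switches for k=4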
Software-defined networking is a rapidly expanding networking paradigm that aims to separate the control logic from the forwarding devices. Through centralized control, network operators are able to deploy and manage more efficient forwarding strategies. Traditionally, when the network undergoes a change through maintenance, failure, or cyber attack, the centralized controller processes these events and deploys new forwarding rules reactively. This work provides a strategy that does not require a controller to maintain connectivity, using only features within the existing OpenFlow protocol, version 1.3 or greater. In this paper, we illustrate why forwarding resiliency is desired in OpenFlow networks and provide an algorithm that computes the flow entries required to achieve maximal forwarding resiliency in the presence of both multiple link and controller failures on any arbitrary network.
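The OpenFlow 1.3 feature such an algorithm can rely on is the fast-failover group, whose buckets each watch a port so the switch falls back to a backup path on its own, without contacting the controller. Below is a hedged sketch of that mechanism written for the Ryu framework (it assumes a Ryu datapath object; the port numbers are illustrative, and this is not the paper's algorithm itself).

    def install_failover_group(datapath, group_id=1, primary_port=1, backup_port=2):
        ofp = datapath.ofproto
        parser = datapath.ofproto_parser
        buckets = [
            # bucket used while primary_port reports as live
            parser.OFPBucket(watch_port=primary_port,
                             actions=[parser.OFPActionOutput(primary_port)]),
            # the switch falls back to this bucket if the primary port fails
            parser.OFPBucket(watch_port=backup_port,
                             actions=[parser.OFPActionOutput(backup_port)]),
        ]
        datapath.send_msg(parser.OFPGroupMod(datapath, ofp.OFPGC_ADD,
                                             ofp.OFPGT_FF, group_id, buckets))
        # a flow entry with parser.OFPActionGroup(group_id) then steers
        # traffic through this group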
Software-defined networking (SDN) technology promises centralized and rapid network provisioning, holistic management, low operational cost, and improved network visibility. Researchers have developed multiple SDN simulation and emulation platforms to expedite the adoption of many emerging SDN-based applications to production systems. However, the scalability of those platforms is often limited by the underlying physical hardware resources, which inevitably affects the simulation fidelity in large-scale network settings. In this paper, we present a model abstraction technique that effectively transforms the network devices in an SDN-based network into one virtualized switch model. While significantly reducing the model execution time and enabling real-time simulation capability, our abstracted model also preserves the end-to-end forwarding behavior of the original network. To achieve this, we first classify packets with the same forwarding behavior into smaller and disjoint equivalence classes (ECs) by analyzing the OpenFlow rules installed on the SDN devices. We then create a graph model representing the forwarding behavior of each EC. By traversing those graphs, we finally construct the rules of the big-switch model to effectively preserve the original network's end-to-end forwarding behavior. Experimental results demonstrate that the network forwarding logic equivalence is well preserved between the abstracted model and the original SDN network. The model abstraction process is fast, e.g., 3.15 seconds to transform a medium-scale tree network consisting of 53,260 rules. The big-switch model is able to speed up the simulation by 4.3 times on average and up to 6.69 times across our evaluation experiments.
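The equivalence-class step can be illustrated in one dimension: treat a header field as an integer range, split it at every rule boundary, and group the disjoint intervals by the action of the highest-priority matching rule. The sketch below shows this simplified version, not the multi-field OpenFlow analysis the paper performs.

    def equivalence_classes(rules, field_max):
        # rules: list of (priority, lo, hi, action) with inclusive ranges
        cuts = sorted({0, field_max + 1}
                      | {r[1] for r in rules}
                      | {r[2] + 1 for r in rules})
        ecs = {}
        for lo, hi in zip(cuts, cuts[1:]):
            # highest-priority rule covering the whole interval [lo, hi-1]
            match = max((r for r in rules if r[1] <= lo and hi - 1 <= r[2]),
                        key=lambda r: r[0], default=None)
            action = match[3] if match else "drop"
            ecs.setdefault(action, []).append((lo, hi - 1))
        return ecs

    rules = [(10, 0, 127, "port1"), (20, 64, 95, "port2")]
    print(equivalence_classes(rules, 255))
    # {'port1': [(0, 63), (96, 127)], 'port2': [(64, 95)], 'drop': [(128, 255)]}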
To build a resilient and secure microgrid in the face of growing cyber-attacks and cyber-mistakes, we present a software-defined networking (SDN)-based communication network architecture for microgrid operations. We leverage the global visibility, direct networking controllability, and programmability offered by SDN to investigate multiple security applications, including self-healing communication network management, real-time and uncertainty-aware communication network verification, and specification-based intrusion detection. We also extend a novel cyber-physical testing and evaluation platform that combines a power distribution system simulator (for microgrid energy services) and an SDN emulator with a distributed control environment (for microgrid communications). Experimental results demonstrate that the SDN-based communication architecture and applications can significantly enhance the resilience and security of microgrid operations against the realization of various cyber threats.
Presented at NSA SoS Quarterly Meeting, July 2016 and November 2016
Modern industrial control systems (ICSes) are increasingly adopting Internet technology to boost control efficiency, which unfortunately opens up a new frontier for cyber-security. People have typically applied existing Internet security techniques, such as firewalls or anti-virus and anti-spyware software. However, those security solutions can only provide protection at single devices. To address this, we design a novel software-defined networking (SDN) architecture that offers global visibility of a control network infrastructure, and we investigate innovative SDN-based applications with a focus on ICS security, such as network verification and self-healing phasor measurement unit (PMU) networks. We are also conducting rigorous evaluation using the IIT campus microgrid as well as a high-fidelity testbed combining network emulation and power system simulation.
Presented at the Illinois ITI Trust and Security/Science of Security Seminar, March 15, 2016.
Best Poster Award, Workshop on Science of Security through Software-Defined Networking, Chicago, IL, June 16-17, 2016.
The successful operations of modern power grids are highly dependent on a reliable and efficient underlying communication network. Researchers and utilities have started to explore the opportunities and challenges of applying the emerging software-defined networking (SDN) technology to enhance efficiency and resilience of the Smart Grid. This trend calls for a simulation-based platform that provides sufficient flexibility and controllability for evaluating network application designs, and facilitates the transition from in-house research ideas to real production systems. In this paper, we present DSSnet, a hybrid testing platform that combines a power distribution system simulator with an SDN emulator to support high-fidelity analysis of communication network applications and their impacts on the power systems. Our contributions lie in the design of a virtual time system with tight controllability over the execution of the emulation system, i.e., pausing and resuming any specified container processes in the perception of their own virtual clocks, with little overhead (scaling to 500 emulated hosts with an average overhead of 70 ms), and in the efficient synchronization of the two sub-systems based on virtual time. We evaluate the system performance of DSSnet and also demonstrate its usability through a case study evaluating a load-shifting algorithm.
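One concrete primitive a container-based virtual time system can build on for pause/resume is the Linux cgroup (v1) freezer, sketched below. The freezer mount point is standard; the per-container cgroup naming (e.g., "dssnet/host1") is assumed for illustration rather than taken from DSSnet.

    FREEZER_ROOT = "/sys/fs/cgroup/freezer"   # assumes cgroup v1 freezer mount

    def set_freezer_state(cgroup, state):
        # state is "FROZEN" (pause) or "THAWED" (resume); the per-container
        # cgroup (e.g., "dssnet/host1") is assumed to exist already
        with open(f"{FREEZER_ROOT}/{cgroup}/freezer.state", "w") as f:
            f.write(state)

    # set_freezer_state("dssnet/host1", "FROZEN")   # pause the emulated host
    # ... advance the power-flow simulation to the next sync point ...
    # set_freezer_state("dssnet/host1", "THAWED")   # resume it in virtual time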
Container-based network emulation offers high fidelity and a scalable testing environment to bridge the gap between research ideas and real-world network applications. However, containers take their notions of time from the physical system clock, and thus the time-stamped events from different containers are multiplexed to reflect the scheduling serialization by the Linux operating system. Conjoining the emulator and other simulators is also challenging due to the difficulties of synchronizing the virtual simulation clock with the physical system clock. Virtual time systems for network emulation shed light on both issues. In this paper, we develop a lightweight container-based virtual time system in the Linux kernel. We use time dilation to trade time with system resources by precisely scaling the time of interactions between containers and physical devices. We develop a time freezer to enable the precise pause and resume of an emulation experiment, which offers the virtual time support to interface with simulators for close synchronization. We integrate the virtual time system into a software-defined networking emulator, Mininet, and evaluate the system accuracy, scalability, and overhead. Finally, we use the virtual-time-enabled emulation testbed to conduct a case study of equal-cost multi-path routing protocol analysis in a data center network.
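The time dilation idea can be illustrated in user space (the paper implements it in the kernel): with a time dilation factor (TDF) of n, an application observes one virtual second for every n physical seconds, so the emulated network appears n times faster relative to its own clock. A minimal sketch:

    import time

    class DilatedClock:
        def __init__(self, tdf):
            self.tdf = tdf                  # time dilation factor, e.g., 10
            self._start = time.monotonic()  # physical epoch of this clock

        def now(self):
            # 1 virtual second elapses per tdf physical seconds
            return (time.monotonic() - self._start) / self.tdf

    clock = DilatedClock(tdf=10.0)
    time.sleep(0.5)
    print(f"virtual elapsed: {clock.now():.3f}s")   # roughly 0.05 seconds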
We present ConVenus, a system that performs rapid congestion verification of network updates in software-defined networks. ConVenus is a lightweight middleware between the SDN controller and network devices, and is capable of intercepting flow updates from the controller and verifying whether the amount of traffic on any link or switch exceeds the desired capacity. To enable online verification, ConVenus dynamically identifies the minimum set of flows and switches that are affected by each flow update, and creates a compact network model. ConVenus uses a four-phase simulation algorithm to quickly compute the throughput of every flow in the network model and report network congestion. The experimental results demonstrate that ConVenus manages to verify 90% of the updates in a network consisting of over 500 hosts and 80 switches within 5 milliseconds.
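A toy version of the capacity check at the heart of such verification is sketched below: accumulate per-link load over each flow's path and flag any link pushed over capacity. ConVenus itself scopes this to the affected flows and computes throughputs with its four-phase simulation; the flows and capacities here are invented for illustration.

    def find_congested_links(flows, capacity):
        # flows: {flow_id: (rate_mbps, [links on the flow's path])}
        load = {}
        for rate, path in flows.values():
            for link in path:
                load[link] = load.get(link, 0.0) + rate
        return {l: v for l, v in load.items()
                if v > capacity.get(l, float("inf"))}

    flows = {"f1": (600.0, ["s1-s2", "s2-s3"]), "f2": (500.0, ["s1-s2"])}
    print(find_congested_links(flows, {"s1-s2": 1000.0, "s2-s3": 1000.0}))
    # {'s1-s2': 1100.0}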
It is critical to ensure that network policy remains consistent during state transitions. However, existing techniques impose high costs in update delay and/or FIB space. We propose the Customizable Consistency Generator (CCG), a fast and generic framework to support customizable consistency policies during network updates. CCG effectively reduces the task of synthesizing an update plan under the constraint of a given consistency policy to a verification problem, by checking whether an update can safely be installed in the network at a particular time, and greedily processing network state transitions to heuristically minimize transition delay. We show a large class of consistency policies are guaranteed by this greedy heuristic alone; in addition, CCG makes judicious use of existing heavier-weight network update mechanisms to provide guarantees when necessary. As such, CCG nearly achieves the “best of both worlds”: the efficiency of simply passing through updates in most cases, with the consistency guarantees of more heavyweight techniques. Mininet and physical testbed evaluations demonstrate CCG’s capability to achieve various types of consistency, such as path and bandwidth properties, with zero switch memory overhead and up to a 3× delay reduction compared to previous solutions.
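The greedy core described above can be sketched as follows: repeatedly try each pending switch update and install it only if a verifier confirms the intermediate network state still satisfies the consistency policy, deferring the rest. The state.apply and policy_holds interfaces are hypothetical stand-ins for CCG's verification engine, not its actual API.

    def greedy_schedule(pending, state, policy_holds):
        # pending: switch-level updates; policy_holds: assumed verifier that
        # checks a network state against the consistency policy
        rounds = []
        while pending:
            installed = []
            for update in list(pending):
                candidate = state.apply(update)   # hypothetical state interface
                if policy_holds(candidate):       # verify, do not synthesize
                    state = candidate
                    installed.append(update)
                    pending.remove(update)
            if not installed:
                return rounds, pending   # leftovers need a heavyweight fallback
            rounds.append(installed)
        return rounds, []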
Presented at the Illinois SoS Bi-Weekly Meeting, February 2015.
We rely on network infrastructure to deliver critical services and ensure security. Yet networks today have reached a level of complexity that is far beyond our ability to have confidence in their correct behavior, resulting in significant time investment and security vulnerabilities that can cost millions of dollars, or worse. Motivated by this need for rigorous understanding of complex networks, I will give an overview of our Science of Security lablet project, A Hypothesis Testing Framework for Network Security.
First, I will discuss the emerging field of network verification, which transforms network security by rigorously checking that intended behavior is correctly realized across the live running network. Our research developed a technique called data plane verification, which has discovered problems in operational environments and can verify hypotheses and security policies with millisecond-level latency in dynamic networks. In just a few years, data plane verification has moved from early research prototypes to production deployment. We have built on this technique to reason about hypotheses even under the temporal uncertainty inherent in a large distributed network. Second, I will discuss a new approach to reasoning about networks as databases that we can query to determine answers to behavioral questions and to actively control the network. This talk will span work by a large group of folks, including Anduo Wang, Wenxuan Zhou, Dong Jin, Jason Croft, Matthew Caesar, Ahmed Khurshid, and Xuan Zou.
Presented at the Illinois ITI Joint Trust and Security/Science of Security Seminar, September 15, 2015.
Best Poster Award, ACM SIGCOMM Conference on Principles of Advanced Discrete Simulation, London, UK, June 10-12, 2015.
The emerging software-defined networking (SDN) technology decouples the control plane from the data plane in a computer network with open and standardized interfaces, and hence opens up the network designers’ options and ability to innovate. The wide adoption of SDN in industry has motivated the development of large-scale, high-fidelity testbeds for evaluation of systems that incorporate SDN. In this article, we develop a framework to support OpenFlow-based SDN simulation and distributed emulation, by leveraging our prior work on a hybrid network testbed with a parallel network simulator and a virtual-machine-based emulation system. We show how to exploit typical SDN controller behaviors to handle performance issues caused by the centralized controller in parallel discrete-event simulation. In particular, we develop an asynchronous synchronization algorithm for passive SDN controllers and design a two-level architecture for active SDN controllers. We evaluate the system performance, showing good scalability. Finally, we present a case study, using the testbed, to evaluate network verification applications in an SDN-based data center network.
The advancement of software-defined networking (SDN) technology is highly dependent on successful transformations from in-house research ideas to real-life products. To enable such transformations, a testbed offering a scalable and high-fidelity networking environment for testing and evaluating new and existing designs is extremely valuable. Mininet, the most popular SDN emulator by far, is designed to achieve both accuracy and scalability by running unmodified code of network applications in lightweight Linux containers. However, Mininet cannot guarantee performance fidelity under high workloads, in particular when the number of concurrent active events is greater than the number of parallel cores. In this project, we develop a lightweight virtual time system in Linux containers and integrate the system with Mininet, so that all the containers have their own virtual clocks rather than using the physical system clock, which reflects the serialized execution of multiple containers. With the notion of virtual time, all the containers perceive virtual time as if they run independently and concurrently. As a result, interactions between the containers and the physical system are artificially scaled, making a network appear to be ten times faster from the viewpoint of applications within the containers than it actually is. We also design an adaptive virtual time scheduling subsystem in Mininet, which is responsible for balancing the experiment speed and fidelity. Experimental results demonstrate that embedding virtual time into Mininet significantly enhances its performance fidelity, and therefore results in a useful platform for the SDN community to conduct scalable experiments with high fidelity.
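An adaptive policy of the kind described might, for instance, raise the time dilation factor when the emulation host is overloaded and lower it when the host is idle. The sketch below shows one such heuristic; the thresholds and the set_tdf hook are invented for illustration and are not Mininet's actual interface.

    import os

    def adapt_tdf(current_tdf, min_tdf=1.0, max_tdf=64.0,
                  high_load=0.9, low_load=0.5):
        load = os.getloadavg()[0] / os.cpu_count()   # normalized 1-min load
        if load > high_load:
            return min(current_tdf * 2, max_tdf)   # overloaded: dilate more
        if load < low_load:
            return max(current_tdf / 2, min_tdf)   # idle: speed the run up
        return current_tdf

    # tdf = adapt_tdf(tdf); set_tdf(containers, tdf)   # set_tdf is hypothetical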
Fat-tree topologies have been widely adopted as the communication network in data centers in the past decade. Nowadays, high-performance computing (HPC) system designers are considering using fat-tree as the interconnection network for the next generation of supercomputers. For extreme-scale computing systems like data centers and supercomputers, performance is highly dependent on the interconnection network. In this paper, we present FatTreeSim, a PDES-based toolkit consisting of a highly scalable fat-tree network model, with the goal of better understanding the design constraints of fat-tree networking architectures in data centers and HPC systems, as well as evaluating the applications running on top of the network. FatTreeSim is designed to model and simulate large-scale fat-tree networks up to millions of nodes with protocol-level fidelity. We have conducted extensive experiments to validate and demonstrate the accuracy, scalability, and usability of FatTreeSim. On Argonne Leadership Computing Facility’s Blue Gene/Q system, Mira, FatTreeSim is capable of achieving a peak event rate of 305 M/s for a 524,288-node fat-tree model with a total of 567 billion committed events. The strong-scaling experiments use up to 32,768 cores and show near-linear scalability. Compared with a small-scale physical system in Emulab, FatTreeSim can accurately model the latency in the same fat-tree network with an error rate of less than 10% in most cases. Finally, we demonstrate FatTreeSim’s usability through a case study in which FatTreeSim serves as the network module of the YARNsim system, with error rates of less than 13.7% for all test cases.
Best Paper Award
Realistic and scalable testing systems are critical to evaluate network applications and protocols to ensure successful real system deployments. Container-based network emulation is attractive because it combines many desired features of network simulators and physical testbeds. The success of Mininet, a popular software-defined networking (SDN) emulation testbed, demonstrates the value of such an approach: we can execute unmodified binary code on a large-scale emulated network with lightweight OS-level virtualization techniques. However, an ordinary network emulator uses the system clock across all the containers even if a container is not being scheduled to run. This leads to the issue of temporal fidelity, especially with high workloads. Virtual time sheds light on the issue of preserving temporal fidelity for large-scale emulation. The key insight is to trade time with system resources by precisely scaling the time of interactions between containers and physical devices by a factor of n, hence making an emulated network appear to be n times faster from the viewpoint of applications in the containers. In this paper, we develop a lightweight Linux-container-based virtual time system and integrate it with Mininet for fidelity and scalability enhancement. We also design an adaptive time dilation scheduling module for balancing speed and accuracy. Experimental results demonstrate that (1) with virtual time, Mininet is able to accurately emulate a network n times larger in scale, where n is the scaling factor, with system behaviors closely matching data obtained from a physical testbed; and (2) with adaptive time dilation scheduling, we reduce the running time by 46% with little accuracy loss. Finally, we present a case study using the virtual-time-enabled Mininet to evaluate the limitations of equal-cost multi-path (ECMP) routing in a data center network.
Today’s quality of life is highly dependent on the successful operation of many large-scale industrial control systems (ICSes). To enhance their protection against cyber-attacks and operational errors, we develop a simulation-based verification framework with cross-layer verification techniques that allow comprehensive analysis of the entire ICS-specific stack, including the application, protocol, and network layers.
Work in progress paper.