Automated Synthesis Framework for Network Security and Resilience - April 2020
PI: Matthew Caesar
Co-PI: Dong (Kevin) Jin
Researchers: Bingzhe Liu, Santhosh Prabhu, and Xiaoliang Wu
HARD PROBLEM(S) ADDRESSED
This refers to Hard Problems, released November 2012.
This project is developing the analysis methodology needed to support scientific reasoning about the resilience and security of networks, with a particular focus on network control and information/data flow. The core of this vision is an automated synthesis framework (ASF), which will automatically derive network state and repairs from a set of specified correctness requirements and security policies. ASF consists of a set of techniques for performing and integrating security and resilience analyses applied at different layers in a real-time and automated fashion. This project is building both the theoretical underpinnings and a practical realization of Science of Security. The project covers four hard problems: (1) resilient architectures (primary), (2) scalability and composability, (3) policy-governed secure collaboration, and (4) security-metrics-driven evaluation, design, development, and deployment.
PUBLICATIONS
Papers written as a result of your research from the current quarter only.
- Santhosh Prabhu, Kuan Yen Chou, Ali Kheradmand, Brighten Godfrey, Matthew Caesar, Plankton: Scalable Network Configuration Verification Through Model Checking, NSDI, February 2020
Abstract: Network configuration verification enables operators to ensure that the network will behave as intended, prior to deployment of their configurations. Although techniques ranging from graph algorithms to SMT solvers have been proposed, scalable configuration verification with sufficient protocol support continues to be a challenge. In this paper, we show that by combining equivalence partitioning with explicit-state model checking, network configuration verification can be scaled significantly better than the state of the art, while still supporting a rich set of protocol features. We propose Plankton, which uses symbolic partitioning to manage large header spaces and efficient model checking to exhaustively explore protocol behavior. Thanks to a highly effective suite of optimizations including state hashing, partial order reduction, and policy-based pruning, Plankton successfully verifies policies in industrial-scale networks quickly and compactly, at times reaching a 10000x speedup compared to the state of the art.
Hard problem(s) addressed: resilient architectures
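The equivalence partitioning idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not Plankton's implementation; it only shows, under simplifying assumptions, how a large header space can be cut into a small number of classes that a verifier can then explore one at a time. The function name `equivalence_classes` and the restriction to IPv4 destination prefixes are assumptions made for illustration.

```python
import ipaddress


def equivalence_classes(prefixes):
    """Split the IPv4 address space into intervals matched by the same prefixes.

    Returns a list of (first_address, last_address, matching_prefixes) tuples,
    with addresses given as integers. Every address inside one interval is
    matched by exactly the same set of prefixes, so a verifier only needs to
    analyze one representative address per interval.
    """
    nets = [ipaddress.ip_network(p) for p in prefixes]

    # Each prefix contributes two cut points: its first address and the
    # address immediately after its last one.
    cuts = {0, 2**32}
    for net in nets:
        cuts.add(int(net.network_address))
        cuts.add(int(net.broadcast_address) + 1)

    points = sorted(cuts)
    classes = []
    for lo, hi in zip(points, points[1:]):
        # A prefix matches the interval only if it covers it entirely;
        # by construction no prefix boundary falls strictly inside it.
        matching = frozenset(
            str(net)
            for net in nets
            if int(net.network_address) <= lo
            and hi - 1 <= int(net.broadcast_address)
        )
        classes.append((lo, hi - 1, matching))
    return classes


if __name__ == "__main__":
    for lo, hi, rules in equivalence_classes(["10.0.0.0/8", "10.1.0.0/16"]):
        first = ipaddress.ip_address(lo)
        last = ipaddress.ip_address(hi)
        print(first, last, sorted(rules))
```

With the two example prefixes, the 2^32 possible destination addresses collapse into five intervals, which hints at why partitioning makes exhaustive exploration tractable.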
KEY HIGHLIGHTS
Each effort should submit one or two specific highlights. Each item should include a paragraph or two along with a citation if available. Write as if for the general reader of IEEE S&P.
The purpose of the highlights is to give our immediate sponsors a body of evidence that the funding they are providing (in the framework of the SoS lablet model) is delivering results that "more than justify" the investment they are making.
In the current quarter, our project progress is centered on addressing SoS lablet hard problems primarily in resilient architecture. Key highlights are listed as follows.
- We continued the transfer of our technology to industry through interactions with Veriflow and VMware. Veriflow is a startup company commercializing verification technology that came out of this project's SoS lablet funding. This startup company employed over thirty people in the United States and has conducted multiple pilots and deployments across several industry sectors, including within the US Department of Defense. In September 2019, Veriflow was sold to VMware. Veriflow's core technology is slated to be introduced into VMware's widely-used NSX platform, which comprises over 26% of the market. More information is available at www.veriflow.net. Current collaborations target enhancement of our verification technology to operate on real-time traffic data.
- We continue to investigate automated synthesis of network control to preserve desired security policies and network invariants. Specific invariants include (i) reduction of reaction time to fix problems, (ii) avoidance of introducing errors in the repair process, and (iii) prevention of vulnerabilities. We have thus far designed a list of approximately 30 important and useful invariants to showcase the functionality of our system as well as to test it in practical use. We are also exploring how to synthesize patches that automatically fix critical invariants violated by the network controller application. The candidate approach under consideration models the forwarding behavior of data through the network, the control operations conducted on the network, and the interactions between the two. Since our last report, we have collected historical network configurations from the University of Illinois campus network going back over 15 years. We have performed a study of our techniques on this dataset, the results of which we are using to improve and tune our design and architecture.
- We continued exploring self-healing network management to address the resilient architecture hard problem, and applying these methods to cyber-physical energy systems. We revisited the optimization formulation to include the rule compression process in order to reduce the total recovery time. We continue to evaluate our proof-of-concept system using the IEEE 14-, 30-, and 118-bus systems. We are currently preparing a manuscript describing this work.
- We continue to study the interdependence between the power system and the communication network with the goal of improving resilience in critical energy infrastructures. Our review paper on power grid resilience enhancement has been accepted by the IET Smart Grid journal. We are investigating the power system restoration problem and have formulated an optimization model that accounts for the interdependencies between the two systems. We developed an algorithm to accurately model the communication system and the interdependency relationship in the system. The model includes the procedures for device repair, electricity recovery, and communication flow in the restoration process. We generate the optimal restoration plan by solving a mixed integer linear programming problem.
- We continue to develop a simulation/emulation-based platform for cyber-physical system resilience and security evaluation. The platform combines physical computing and networking hardware for the cyber presence while allowing for offline simulation and computation of the physical world. To efficiently synchronize the simulation with the network emulation/hardware, we are exploring a hybrid virtual time scheme that combines a scheduling-based approach with our existing interrupt-based approach. This modification will enable a wider range of applications.
- We have continued our collaboration with AT&T, which operates one of the largest networks in the world, to customize and deploy our technology in their environments. AT&T faces some unique challenges which will require custom solutions for their environment. In particular, AT&T runs application-centric networks, composed of both traditional networking elements and application services. To perform verification and synthesis in these networks, we are developing extensions to verify application-level semantics across the network. The operator can specify end-to-end requirements on application behavior, which are analyzed across the entire network. For this to work, our system requires extensions to analyze application-level properties, as well as information about how each device or application component in the network manipulates data flow. Since our last report we have held weekly phone meetings with AT&T, where we get their feedback and learn about their challenges; we are using this input to guide our design. We have formulated a set of invariants that characterize their operational challenges. Moving forward, we are developing a platform to translate invariants and network data into the Planning Domain Definition Language (PDDL). We believe these efforts will allow us to solve their challenges efficiently in a manner that leverages the capabilities of planning algorithms. We have also constructed an implementation of our solving technique in PDDL and tested our approach over several test cases. When the platform is complete, we plan to work with AT&T to evaluate performance on their network. Based on their interest in our work, AT&T has agreed to share operational data and access to their network engineers. We have already worked with AT&T to create instrumentation to collect this dataset from their network.
- We have continued our collaborations with Boeing on constructing a resilient IoT platform for the battlefield. IoT is crucially important to modern battlefield environments, making it a ripe target for adversaries. We are exploring an approach that leverages deep learning to dynamically relocate drone-mounted access points to evade the adversary. As part of this work we have formulated a placement algorithm that leverages Model-Agnostic Meta-Learning (MAML) to construct a machine learning algorithm resilient to an adversary attempting to disrupt the learning process. We are in the process of constructing simulations of our approach and using those results to tune and refine our designs. We have also made progress constructing a real drone implementation, which we will use to evaluate our design. We hope to proceed to a completed approach and construction of a drone-mounted testbed within the next three quarters.
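To make the PDDL translation step from the AT&T highlight more concrete, the following is a minimal sketch of how a single reachability requirement and a link list might be rendered as a planning problem. It is not the project's platform: the domain name `network`, the predicates `link` and `at`, the action `forward`, and the function `to_pddl_problem` are all hypothetical names chosen for illustration, and real invariants would need a much richer domain.

```python
# Hypothetical planning domain: a packet moves along directed links.
DOMAIN = """(define (domain network)
  (:requirements :strips :typing)
  (:types node packet)
  (:predicates (link ?a ?b - node) (at ?p - packet ?n - node))
  (:action forward
    :parameters (?p - packet ?a ?b - node)
    :precondition (and (at ?p ?a) (link ?a ?b))
    :effect (and (at ?p ?b) (not (at ?p ?a)))))"""


def to_pddl_problem(name, links, src, dst):
    """Render 'traffic entering at src must be able to reach dst' as PDDL.

    links is a list of (from_node, to_node) pairs taken from network data.
    The returned text is a problem file for the hypothetical DOMAIN above.
    """
    nodes = sorted({n for link in links for n in link} | {src, dst})
    init = [f"(link {a} {b})" for a, b in links]
    init.append(f"(at flow {src})")
    return "\n".join([
        f"(define (problem {name})",
        "  (:domain network)",
        f"  (:objects {' '.join(nodes)} - node flow - packet)",
        "  (:init " + " ".join(init) + ")",
        f"  (:goal (at flow {dst})))",
    ])


if __name__ == "__main__":
    print(DOMAIN)
    print(to_pddl_problem("reach", [("r1", "r2"), ("r2", "r3")], "r1", "r3"))
```

A standard PDDL planner given these two texts would search for a sequence of `forward` actions from `r1` to `r3`; if no plan exists, the requirement cannot be met on the given topology.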
COMMUNITY ENGAGEMENTS
- Matthew Caesar was selected to serve on the program committee for ACM CCS 2021, a top conference in computer security.
- Matthew Caesar was selected to serve on the program committee for USENIX NSDI 2021, a top conference in computer systems.
- Matthew Caesar continues to serve as Chief Science Officer of Veriflow, a company commercializing technology spun out of our Science of Security lablet work. Matthew has worked with Veriflow to undertake multiple new deployments of our earlier technology at top commercial-sector firms this quarter. The most recent news about Veriflow is available on the Veriflow web site (http://www.veriflow.net).
- Matthew Caesar has continued an engagement with the University of Illinois Center for Digital Agriculture towards securing our nation's food supply. His work leverages machine learning to detect anomalies in supply-chain operations.
- Kevin Jin served as the general chair of the 2019 ACM SIGSIM-PADS Conference.
- Kevin Jin is serving as a track coordinator of the "Simulation and Cyber Security" track in the Winter Simulation Conference 2019. The track now consists of two paper sessions and one panel with the topic "Simulation for Cyber Security Risk Management."
- Kevin Jin is serving as the web chair for the 2020 ACM SIGCOMM Symposium on SDN Research (SOSR).
EDUCATIONAL ADVANCES
- Matthew Caesar was elected to become the Director of Education for ACM SIGCOMM. As part of his tenure, Matthew will work with universities across the United States to further rigorous education on cybersecurity.
- Kevin Jin is serving as the Ph.D. colloquium chair of the 2020 ACM SIGSIM-PADS Conference and has submitted a student travel grant proposal to NSF in December 2019.
- Kevin Jin and Kyle Hale are developing a new graduate-level cyber security class "CSP544 System and Network Security" for Spring 2020 at Illinois Institute of Technology.
- Kevin Jin has been appointed as the Director of the new Master of Cybersecurity Program in the College of Science at Illinois Institute of Technology (https://science.iit.edu/programs/graduate/master-cybersecurity-mcybcode). The program will serve as one more platform to disseminate the educational and research outcomes of our Science of Security projects.
- Kevin Jin and Chen Chen (Argonne National Lab) are preparing a tutorial titled "Electric Power System Resilience" at the 2019 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm) in October 2019.
- We organized a Ph.D. colloquium as part of the ACM SIGSIM-PADS conference in June 2019. The Ph.D. colloquium included a career panel, poster session, student presentations, and a meeting with editors. We received 20+ submissions, and 15 students were selected to present their work, among whom 5 US-based Ph.D. students received the NSF student travel grant. The event provided mentoring and educational opportunities to young researchers, equipping them with tools that support their career success.
- Kevin Jin gave a full-day tutorial on "Cyber Security and Resilience of Cyber-Physical Systems" in the Internet of Things (IoT) Systems Research Center at the University of Wisconsin-Madison, June 2019.
- Matthew Caesar has created a new class on Internet of Things at UIUC. The class contains extensive coverage of security in this important domain. The class is slated for public release this fall on Coursera's Massive Open Online Course (MOOC) platform. The course will be open for enrollment by anyone, including people not attending the University of Illinois. Development of a beta version of this class has been completed and its inaugural enrollment is open now for Spring 2020. The class has filled to capacity.
- Matthew Caesar also continues to refine his Networking Laboratory class, targeting release for Spring 2020. He has developed a new set of cybersecurity lectures for this class, covering important topics and teaching students how to improve the security of common networking deployments.
- Matthew Caesar is currently constructing an online platform for working with IoT devices in the cloud. The platform virtualizes IoT devices, internally leveraging a new technology that extends virtual machines into the IoT domain. This work will probably take another year to develop, but when it is released, we hope to grow from small pilots to a platform that can allow students across the world to learn about and work with IoT security in a manner that greatly accelerates their ability to experiment and learn.