Visible to the public Automated Synthesis Framework for Network Security and Resilience - July 2020Conflict Detection Enabled

PI: Matthew Caesar

Co-PI: Dong (Kevin) Jin

Researchers: Bingzhe Liu, Santhosh Prabhu, and Xiaoliang Wu

HARD PROBLEM(S) ADDRESSED
This refers to Hard Problems, released November 2012.

This project is developing the analysis methodology needed to support scientific reasoning about the resilience and security of networks, with a particular focus on network control and information/data flow. The core of this vision is an automated synthesis framework (ASF), which will automatically derive network state and repairs, from a set of specified correctness requirements and security policies. ASF consists of a set of techniques for performing and integrating security and resilience analyses applied at different layers in a real-time and automated fashion. This project is building both theoretical underpinnings and a practical realization of Science of Security. The proposed project covers four hard problems: (1) resilient architectures (primary), (2) scalability and composability, (3) policy-governed secure collaboration, and (4) security-metrics-driven evaluation, design, development and deployment.

PUBLICATIONS
Papers written as a result of your research from the current quarter only.

  • None to report this quarter

KEY HIGHLIGHTS
Each effort should submit one or two specific highlights. Each item should include a paragraph or two along with a citation if available. Write as if for the general reader of IEEE S&P.
The purpose of the highlights is to give our immediate sponsors a body of evidence that the funding they are providing (in the framework of the SoS lablet model) is delivering results that "more than justify" the investment they are making.

In the current quarter, our project progress is centered on addressing SoS lablet hard problems primarily in resilient architecture. Key highlights are listed as follows.

  • We continued the transfer of our technology to industry through interactions with Veriflow and VMWare. Veriflow is a startup company commercializing verification technology that came out of this project's SoS lablet funding. This startup company employed over thirty people in the United States and has conducted multiple pilots and deployments across several industry sectors including within the US Department of Defense. In September 2019, Veriflow was sold to VMWare. Veriflow's core technology is slated to be introduced into VMWare's widely-used NSX platform, which comprises over 26% of the market. More information is available at www.veriflow.net or by Googling "Veriflow". Current collaborations target enhancement of our verification technology to operate on real-time traffic data, as well as on developing a "high-speed" variant of our approach that can perform verification and quickly answer queries on large environments while requiring only small footprints in terms of memory and CPU.
  • We have continued our collaborations with Boeing on constructing a resilient IoT platform for the battlefield. IoT is crucially important to modern battlefield environments, making it a ripe target for adversaries. We are exploring an approach that leverages deep learning to dynamically relocate drone-mounted access points to evade the adversary. As part of this work we have formulated a placement algorithm that leverages Model-Agnostic Machine Learning (MAML) to construct a machine learning algorithm resilient to an adversary attempting to disrupt the learning process. We are in process of constructing simulations of our approach and leveraging those results to tune and refine our designs. We have also made progress constructing a real drone implementation which we will use to evaluate our design, though this work has been somewhat delayed due to the lockdown. Most recently, we have been focusing on improving the attack resilience of our algorithms. We have developed new deep learning mechanisms that are resilient to data sets that are "constructed" by adversaries, and our early simulation results show come benefits to these approaches in practical settings.
  • We have continued our collaboration with AT&T, which operates one of the largest networks in the world, to customize and deploy our technology in their environments. AT&T faces some unique challenges which will require custom solutions for their environment. In particular, AT&T runs application-centric networks, composed of both traditional networking elements and application services. To perform verification and synthesis in these networks, we are developing extensions to verify application-level semantics across the network. The operator can specify end-to-end requirements on application behavior, which are analyzed across the entire network. For this to work, our system requires extensions to analyze application-level properties, as well as information about how each device or application component in the network manipulates data flow. Since our last report we have had weekly meetings with AT&T to build understanding of their challenges. We have formulated a set of invariants that characterize their operational challenges. Moving forward, we are developing a platform to translate invariants and network data into the Planning Domain Definition Language (PDDL). We believe these efforts will allow us to solve their challenges efficiency in a manner that leverages the capabilities of planning algorithms. When this is complete, we plan to work with AT&T to evaluate performance on their network. Based on their interest in our work, AT&T has agreed to share operational data and access to their network engineers. We have already worked with AT&T to create instrumentation to collect this amazing dataset from their network. We have also constructed an implementation of our solving technique in PDDL and tested our approach over several test cases. We are now moving forward with a full implementation of our design, targeting deployment in AT&T's network. We have a weekly phone call with AT&T where we get their feedback and learn about their challenges, which we are using to guide our design. In this most recent quarter, we have focused our efforts on a particular use case of automating capacity provisioning. To support this use case, we have incorporated an optimization framework into our platform, enabling the ability to construct optimal plans.
  • We continue to develop a simulation/emulation-based platform for cyber-physical system resilience and security evaluation. The platform combines physical computing and networking hardware for the cyber presence while allowing for offline simulation and computation of the physical world. To efficient synchronization between simulation and network emulation/hardware, we designed a distributed virtual time system with two synchronization modes. Periodic mode designed for repeating events like sensor devices, and dynamic mode for on-demand interrupt-based synchronization. Our distributed virtual time enables processes and their clocks to be paused, resumed, and dilated across embedded Linux devices through the use of hardware interrupts and a common kernel module. This work has been submitted to ACM Transactions on Modeling and Computer Simulation (TOMACS) and we received some constructive review comments with minor changes required. The leading student author, Christopher Hannon, successfully defended his Ph.D. thesis centered around this testbed for cyber-physical system resilience and security evaluation. He also received the College of Science Excellence in Dissertation Award as well as the Best Student Research Paper Award in Computer Science at IIT.
  • We continue to study the interdependence between the power system and the communication network with the goal of improving resilience in critical energy infrastructures. We are developing an accurate model to represent the interdependencies between the two systems so that one can construct realistic communication networks to meet specific power system requirements for planning and evaluation. The application we investigate is the distribution service restoration (DSR) problem. We model the restoration process as an optimization problem to restore user demand at the earliest possible time. We implemented the analytical model in Matlab and also developed a simulation model including the communication network layer for validation and evaluation. We also developed an algorithm to make the restoration process deadlock-free, i.e., prevent the restoration process covering all loads. We are conducting evaluation experiments both models and preparing a manuscript targeting the IEEE Transactions on Smart Grid.
  • We continued the exploration of self-healing network management to address the resilient architecture hard problem and application of the methods to applications in cyber-physical energy systems. We develop an optimization-based PMU network self-healing scheme, which quickly recovers network connectivity and restores power system observability. We also integrate a new module to compress the network updates to be installed on the switches to further reduce the recovery time. We develop a proof-of-concept system based on scenarios derived from the IEEE 30-bus and 118-bus systems. Our scheme successfully recovers the power system observability for all test cases with fast recovery time. For example, in the cases with 10% of compromised PDCs, the model computational time is less than 87.6 ms for the 30-bus cases and 318.5 ms for the 118-bus cases. This work has been submitted to the 2020 IEEE SmartGridComm conference.
  • We continue to investigate of automated synthesis of network control to preserve desired security policies and network invariants. Specific invariants include (i) reduction of reaction time to fix problems, (ii) avoidance if introduction of errors in the repair process, and (iii) prevention of vulnerabilities. We have thus far designed a list of approximately 30 important and useful invariants to showcase the functionality of our system as well as to test it in practical use. We are also exploring how to synthesize patches to automatically fix critical invariants that were violated by the network controller application. The candidate approach under consideration models both the forwarding behavior of data through the network, control operations conducted on the network, as well as operations between the two. We have collected historical network configurations from the University of Illinois campus network going back over 15 years. We have performed a study of our techniques on this dataset, results of which we are using to improve and tune our design and architecture. We have also begun two new initiatives to extend applicability of this work: (a) a mechanism to allow synthesis of the distributed control software found in modern cloud environments - here, we have performed an extensive analysis of real cloud vulnerabilities, and developed baseline algorithms to synthesize control parameters for cloud environments (b) an application of our techniques to wide-area - here, we have begun a collaboration with a security company (Censys) to perform wide-area verification of networked services, and have begun discussions on approaches to incorporate their data into our synthesis frameworks.

COMMUNITY ENGAGEMENTS

  • Matthew Caesar helped create and will serve as co-chair for an ACM SIGCOMM workshop on "Teaching and Learning Computer Networking During the Pandemic". The workshop will provide support to the many universities who suddenly had to move online during the pandemic, and the many students who are grappling and facing many new challenges with working online.
  • Matthew Caesar was selected to serve as the mentoring chair for ACM SIGCOMM 2021. As part of his duties, he is helping to design the conference to be the first "virtual" SIGCOMM conference ever held.
  • Matthew Caesar was selected to serve on the program committee for ACM CCS 2021, a top conference in computer security.
  • Matthew Caesar was selected to serve on the program committee for ACM NSDI 2021, a top conference in computer systems.
  • Matthew Caesar was selected as an Editor for IEEE/ACM Transactions on Networking.
  • Matthew Caesar continues to serve as Chief Science Officer of Veriflow, a company commercializing technology spun out of our Science of Security lablet work. Matthew has worked with Veriflow to undertake multiple new deployments of our earlier technology at top commercial-sector firms this quarter. The most recent news about Veriflow is available on the Veriflow web site (http://www.veriflow.net).
  • Matthew Caesar has continued an engagement with the University of Illinois Center for Digital Agriculture towards securing our nation's food supply. His work leverages machine learning to detect anomalies in supply-chain operations.
  • Matthew Caesar is working with Providence St. Joseph Health, a large hospital in Los Angeles, in developing an IoT-based home monitoring system for COVID19. The device will be secure and HIPAA-compliant, collect vital signs to make sure patients are doing ok and will store results in the cloud for later processing.

EDUCATIONAL ADVANCES

  • Christopher Hannon, a former Ph.D. student of Kevin Jin, graduated in May 2020, and started to work in CRCL GmbH in June 2020. Umar Farooq, an M.S. student of Matthew Caesar, graduated in May 2020, and will join Amazon, participating in the design of their cloud networking environments. Bella Lee, an M.S. student of Matthew Caesar, also graduated in May 2020, and will join Google.
  • Kevin Jin and Kyle Hale has developed a new graduate-level cyber security class "CSP544 System and Network Security" for Spring 2020 at Illinois Institute of Technology (IIT); and the TA, Gong Chen (one of Kevin's Ph.D. student) received the 2020 Best TA award in Computer Science at IIT.
  • Kevin organized a virtual Ph.D. colloquium as part of the ACM SIGSIM-PADS conference in June 2020. The Ph.D. colloquium included a keynote speech and multiple student presentations with 99 attendees. We applied and received the NSF student travel grant for the event. The grant has been extended to SIGSIM-PADS'2021 as the COVID-19 pandemic made this year's conference online.
  • Jiaqi Yan, a former Ph.D. student of Kevin Jin, graduated in Dec 2019, and started to work in Microsoft in Jan 2020.
  • Christopher Hannon, a Ph.D. student of Kevin Jin, received the College of Science Excellence in Dissertation Award at IIT.
  • Matthew Caesar was elected to become the Director of Education for ACM SIGCOMM. As part of his tenure, Matthew will work with universities across the United States to further rigorous education on cybersecurity.
    Kevin Jin served as the Director of the new Master of Cybersecurity Program in the College of Science at Illinois Institute of Technology (https://www.iit.edu/academics/programs/cybersecurity-mas). The program will serve as one more platform to disseminate the educational and research outcomes of our Science of Security project.
  • Matthew Caesar has created a new class on Internet of Things at UIUC. The class contains extensive coverage of security in this important domain. The class is slated for public release this fall on Coursera's Massive Online Open Course (MOOC) platform. The course will be open for enrollment by anyone, even people not attending the University of Illinois. Development of a beta version of this class has been completed and its inaugural enrollment is open now for Spring 2020. The class has filled to capacity.
  • Matthew Caesar also continues to refine his Networking Laboratory class, targeting release for Spring 2020. He has developed a new set of Cybersecurity lectures for his class, covering important topics, and educating students how to improve security of common networking deployments.
  • Matthew Caesar is currently constructing an online platform for working with IoT devices in the cloud. The platform virtualizes IoT devices, internally leveraging a new technology that extends virtual machines into the IoT domain. This work will probably take another year to develop, but when it is released, we hope to grow from small pilots to a platform that can allow students across the world to learn about and work with IoT security in a manner that greatly accelerates their ability to experiment and learn.