Biblio
Network operators face a challenge of ensuring correctness as networks grow more complex, in terms of scale and increasingly in terms of diversity of software components. Network-wide verification approaches can spot errors, but assume a simplified abstraction of the functionality of individual network devices, which may deviate from the real implementation. In this paper, we propose a technique for high-coverage testing of end-to-end network correctness using the real software that is deployed in these networks. Our design is effectively a hybrid, using an explicit-state model checker to explore all network-wide execution paths and event orderings, but executing real software as subroutines for each device. We show that this approach can detect correctness issues that would be missed both by existing verification and testing approaches, and a prototype implementation suggests the technique can scale to larger networks
with reasonable performance.
Recent years have seen significant advancement in the field of formal network verification. Tools have been proposed for offline data plane verification, real-time data plane verification and configuration verification under arbitrary, but static sets of failures. However, due to the fundamental limitation of not treating the network as an evolving system, current verification platforms have significant constraints in terms of scope. In real-world networks, correctness policies may be violated only through a particular combination of environment events and protocol actions, possibly in a non-deterministic sequence. Moreover, correctness specifications themselves may often correlate multiple data plane states, particularly when dynamic data plane elements are present. Tools in existence today are not capable of reasoning about all the possible network events, and all the subsequent execution paths that are enabled by those events. We propose Plankton, a verification platform for identifying undesirable evolutions of networks. By combining symbolic modeling of data plane and control plane with explicit state exploration, Plankton
performs a goal-directed search on a finite-state transition system that captures the behavior of the network as well as the various events that can influence it. In this way, Plankton can automatically find policy violations that can occur due to a sequence of network events, starting from the current state. Initial experiments have successfully predicted scenarios like BGP Wedgies.
Enterprise networks today have highly diverse correctness requirements and relatively common performance objectives. As a result, preferred abstractions for enterprise networks are those which allow matching correctness specification, while transparently managing performance. Existing SDN network management architectures, however, bundle correctness and performance as a single abstraction. We argue that this creates an SDN ecosystem that is unnecessarily hard to build, maintain and evolve. We advocate a separation of the diverse correctness abstractions from generic performance optimization, to enable easier evolution of SDN controllers and platforms. We propose Oreo, a first step towards a common and relatively transparent performance optimization layer for SDN. Oreo performs the optimization by first building a model that describes every flow in the network, and then performing network-wide, multi-objective optimization based on this model without disrupting higher level correctness.
It is critical to ensure that network policy remains consistent during state transitions. However, existing techniques impose a high cost in update delay, and/or FIB space. We propose the Customizable Consistency Generator (CCG), a fast and generic framework to support customizable consistency policies during network updates. CCG effectively reduces the task of synthesizing an update plan under the constraint of a given consistency policy to a verification problem, by checking whether an update can safely be installed in the network at a particular time, and greedily processing network state transitions to heuristically minimize transition delay. We show a large class of consistency policies are guaranteed by this greedy jeuristic alone; in addition, CCG makes judicious use of existing heavier-weight network update mechanisms to provide guarantees when necessary. As such, CCG nearly achieves the “best of both worlds”: the efficiency of simply passing through updates in most cases, with the consistency guarantees of more heavyweight techniques. Mininet and physical testbed evaluations demonstrate CCG’s capability to achieve various types of consistency, such as path and bandwidth properties, with zero switch memory overhead and up to a 3× delay reduction compared to previous solutions.
Networks are complex and prone to bugs. Existing tools that check configuration files and data-plane state operate offline at timescales of seconds to hours, and cannot detect or prevent bugs as they arise. Is it possible to check network-wide invariants in real time, as the network state evolves? The key challenge here is to achieve extremely low latency during the checks so that network performance is not affected. In this paper, we present a preliminary design, VeriFlow, which suggests that this goal is achievable. VeriFlow is a layer between a software-defined networking controller and network devices that checks for network-wide invariant violations dynamically as each forwarding rule is inserted. Based on an implementation using a Mininet OpenFlow network and Route Views trace data, we find that VeriFlow can perform rigorous checking within hundreds of microseconds per rule insertion.
Presented at NSA SoS Quarterly Meeting, July 2016 and November 2016
We rely on network infrastructure to deliver critical services and ensure security. Yet networks today have reached a level of complexity that is far beyond our ability to have confidence in their correct behavior – resulting in significant time investment and security vulnerabilities that can cost millions of dollars, or worse. Motivated by this need for rigorous understanding of complex networks, I will give an overview of our or Science of Security lablet project, A Hypothesis Testing Framework for Network Security.
First, I will discuss the emerging field of network verification, which transforms network security by rigorously checking that intended behavior is correctly realized across the live running network. Our research developed a technique called data plane verification, which has discovered problems in operational environments and can verify hypotheses and security policies with millisecond-level latency in dynamic networks. In just a few years, data plane verification has moved from early research prototypes to production deployment. We have built on this technique to reason about hypotheses even under the temporal uncertainty inherent in a large distributed network. Second, I will discuss a new approach to reasoning about networks as databases that we can query to determine answers to behavioral questions and to actively control the network. This talk will span work by a large group of folks, including Anduo Wang, Wenxu an Zhou, Dong Jin, Jason Croft, Matthew Caesar, Ahmed Khurshid, and Xuan Zou.
Presented at the Illinois ITI Joint Trust and Security/Science of Security Seminar, September 15, 2015.
Presented to the Illinois SoS Bi-weekly Meeting, April 2015.
Presented at the Illinois SoS Bi-Weekly Meeting, February 2015.
SDN’s logically centralized control provides an insertion point for programming the network. While it is generally agreed that higherlevel abstractions are needed to make that programming easy, there is little consensus on what are the “right” abstractions. Indeed, as SDN moves beyond its initial specialized deployments to broader use cases, it is likely that network control applications will require diverse abstractions that evolve over time. To this end, we champion a perspective that SDN control fundamentally revolves around data representation. We discard any application-specific structure that might be outgrown by new demands. Instead, we adopt a plain data representation of the entire network — network topology, forwarding, and control applications — and seek a universal data language that allows application programmers to transform the primitive representation into any high-level representations presented to applications or network operators. Driven by this insight, we present a system, Ravel, that implements an entire SDN network control infrastructure within a standard SQL database. In Ravel, network abstractions take the form of user-defined SQL views expressed by SQL queries that can be added on the fly. A key challenge in realizing this approach is to orchestrate multiple simultaneous abstractions that collectively affect the same underlying data. To achieve this, Ravel enhances the database with novel data integration mechanisms that merge the multiple views into a coherent forwarding behavior. Moreover, Ravel is exposed to applications through the one simple, familiar and highly interoperable SQL interface. While this is an ambitious long-term goal, our prototype built on the PostgreSQL database exhibits promising performance even for large scale networks.
It is critical to ensure that network policy remains consistent during state transitions. However, existing techniques impose a high cost in update delay, and/or FIB space. We propose the Customizable Consistency Generator (CCG), a fast and generic framework to support customizable consistency policies during network updates. CCG effectively reduces the task of synthesizing an update plan under the constraint of a given consistency policy to a verification problem, by checking whether an update can safely be installed in the network at a particular time, and greedily processing network state transitions to heuristically minimize transition delay. We show a large class of consistency policies are guaranteed by this greedy heuristic alone; in addition, CCG makes judicious use of existing heavier-weight network update mechanisms to provide guarantees when necessary. As such, CCG nearly achieves the “best of both worlds”: the efficiency of simply passing through updates in most cases, with the consistency guarantees of more heavyweight techniques. Mininet and physical testbed evaluations demonstrate CCG’s capability to achieve various types of consistency, such as path and bandwidth properties, with zero switch memory overhead and up to a 3× delay reduction compared to previous solutions.