Biblio
In distributed control systems with shared resources, participating agents can improve the overall performance of the system by sharing data about their personal references. In this paper, we formulate and study a natural tradeoff arising in these problems between the privacy of the agent’s data and the performance of the control system.We formalize privacy in terms of differential privacy of agents’ preference vectors. The overall control system consists of N agents with linear discrete-time coupled dynamics, each controlled to track its preference vector. Performance of the system is measured by the mean squared tracking error. We present a mechanism that achieves differential privacy by adding Laplace noise to the shared information in a way that depends on the sensitivity of the control system to the private data. We show that for stable systems the performance cost of using this type of privacy preserving mechanism grows as O(T3 /Nε2), where T is the time horizon and ε is the privacy parameter. For unstable systems, the cost grows exponentially with time. From an estimation point of view, we establish a lower-bound for the entropy of any unbiased estimator of the private data from any noise-adding mechanism that gives ε-differential privacy. We show that the mechanism achieving this lower-bound is a randomized mechanism that also uses Laplace noise.
The C preprocessor is used in many C projects to support variability and portability. However, researchers and practitioners criticize the C preprocessor because of its negative effect on code understanding and maintainability and its error proneness. More importantly, the use of the preprocessor hinders the development of tool support that is standard in other languages, such as automated refactoring. Developers aggravate these problems when using the preprocessor in undisciplined ways (e.g., conditional blocks that do not align with the syntactic structure of the code). In this article, we proposed a catalogue of refactorings and we evaluated the number of application possibilities of the refactorings in practice, the opinion of developers about the usefulness of the refactorings, and whether the refactorings preserve behavior. Overall, we found 5670 application possibilities for the refactorings in 63 real-world C projects. In addition, we performed an online survey among 246 developers, and we submitted 28 patches to convert undisciplined directives into disciplined ones. According to our results, 63% of developers prefer to use the refactored (i.e., disciplined) version of the code instead of the original code with undisciplined preprocessor usage. To verify that the refactorings are indeed behavior preserving, we applied them to more than 36 thousand programs generated automatically using a model of a subset of the C language, running the same test cases in the original and refactored programs. Furthermore, we applied the refactorings to three real-world projects: BusyBox, OpenSSL, and SQLite. This way, we detected and fixed a few behavioral changes, 62% caused by unspecified behavior in the C language.
Large software systems have to contend with a significant number of users who interact with different components of the system in various ways. The sequences of components that are used as part of an interaction define sets of behaviors that users have with the system. These can be large in number. Among these users, it is possible that there are some who exhibit anomalous behaviors -- for example, they may have found back doors into the system and are doing something malicious. These anomalous behaviors can be hard to distinguish from normal behavior because of the number of interactions a system may have, or because traces may deviate only slightly from normal behavior. In this paper we describe a model-based approach to cluster sequences of user behaviors within a system and to find suspicious, or anomalous, sequences. We exploit the underlying software architecture of a system to define these sequences. We further show that our approach is better at detecting suspicious activities than other approaches, specifically those that use unigrams and bigrams for anomaly detection. We show this on a simulation of a large scale system based on Amazon Web application style architecture.
Software defined networking is a rapidly expanding networking paradigm that aims to separate the control logic from the forwarding devices. Through centralized control, network operators are able to deploy and manage more efficient forwarding strategies. Traditionally, when the network undergoes a change through maintenance, failure, or cyber attack, the centralized controller processes these events and deploys new forwarding rules reactively. This work provides a strategy that does not require a controller in order to maintain connectivity while only using features within the existing OpenFlow protocol version 1.3 or greater. In this paper we illustrate why forwarding resiliency is desired in OpenFlow networks and provide an algorithm that computes the flow entries required to achieve maximal forwarding resiliency in presence of both multiple link and controller failures on any arbitrary network.
Attack graphs used in network security analysis are analyzed to determine sequences of exploits that lead to successful acquisition of privileges or data at critical assets. An attack graph edge corresponds to a vulnerability, tacitly assuming a connection exists and tacitly assuming the vulnerability is known to exist. In this paper we explore use of uncertain graphs to extend the paradigm to include lack of certainty in connection and/or existence of a vulnerability. We extend the standard notion of uncertain graph (where the existence of each edge is probabilistically independent) however, as significant correlations on edge existence probabilities exist in practice, owing to common underlying causes for dis-connectivity and/or presence of vulnerabilities. Our extension describes each edge probability as a Boolean expression of independent indicator random variables. This paper (i) shows that this formalism is maximally descriptive in the sense that it can describe any joint probability distribution function of edge existence, (ii) shows that when these Boolean expressions are monotone then we can easily perform uncertainty analysis of edge probabilities, and (iii) uses these results to model a partial attack graph of the Stuxnet worm and a small enterprise network and to answer important security-related questions in a probabilistic manner.
Scientific advancement is fueled by solid fundamental research, followed by replication, meta-analysis, and theory building. To support such advancement, researchers and government agencies have been working towards a "science of security". As in other sciences, security science requires high-quality fundamental research addressing important problems and reporting approaches that capture the information necessary for replication, meta-analysis, and theory building. The goal of this paper is to aid security researchers in establishing a baseline of the state of scientific reporting in security through an analysis of indicators of scientific research as reported in top security conferences, specifically the 2015 ACM CCS and 2016 IEEE S&P proceedings. To conduct this analysis, we employed a series of rubrics to analyze the completeness of information reported in papers relative to the type of evaluation used (e.g. empirical study, proof, discussion). Our findings indicated some important information is often missing from papers, including explicit documentation of research objectives and the threats to validity. Our findings show a relatively small number of replications reported in the literature. We hope that this initial analysis will serve as a baseline against which we can measure the advancement of the science of security.
Attack graphs used in network security analysis are analyzed to determine sequences of exploits that lead to successful acquisition of privileges or data at critical assets. An attack graph edge corresponds to a vulnerability, tacitly assuming a connection exists and tacitly assuming the vulnerability is known to exist. In this paper we explore use of uncertain graphs to extend the paradigm to include lack of certainty in connection and/or existence of a vulnerability. We extend the standard notion of uncertain graph (where the existence of each edge is probabilistically independent) however, as signicant correlations on edge existence probabilities exist in practice, owing to common underlying causes for dis-connectivity and/or presence of vulnerabilities. Our extension describes each edge probability as a Boolean expression of independent indicator random variables. This paper (i) shows that this formalism is maximally descriptive in the sense that it can describe any joint probability distribution function of edge existence, (ii) shows that when these Boolean expressions are monotone then we can easily perform uncertainty analysis of edge probabilities, and (iii) uses these results to model a partial attack graph of the Stuxnet worm and a small enterprise network and to answer important security-related questions in a probabilistic manner.
Scientific advancement is fueled by solid fundamental research, followed by replication, meta-analysis, and theory building. To support such advancement, researchers and government agencies have been working towards a "science of security". As in other sciences, security science requires high-quality fundamental research addressing important problems and reporting approaches that capture the information necessary for replication, meta-analysis, and theory building. The goal of this paper is to aid security researchers in establishing a baseline of the state of scientific reporting in security through an analysis of indicators of scientific research as reported in top security conferences, specifically the 2015 ACM CCS and 2016 IEEE S&P proceedings. To conduct this analysis, we employed a series of rubrics to analyze the completeness of information reported in papers relative to the type of evaluation used (e.g. empirical study, proof, discussion). Our findings indicated some important information is often missing from papers, including explicit documentation of research objectives and the threats to validity. Our findings show a relatively small number of replications reported in the literature. We hope that this initial analysis will serve as a baseline against which we can measure the advancement of the science of security.
A key use of software-defined networking is to enable scaleout of network data plane elements. Naively scaling networking elements, however, can cause incorrect behavior. For example, we show that an IDS system which operates correctly as a single network element can erroneously and permanently block hosts when it is replicated.
In this paper, we provide a system, COCONUT, for seamless scale-out of network forwarding elements; that is, an SDN application programmer can program to what functionally appears to be a single forwarding element, but whichmay be replicated behind the scenes. To do this, we identifythe key property for seamless scale out, weak causality,and guarantee it through a practical and scalable implementation of vector clocks in the data plane. We prove that COCONUT enables seamless scale out of networking elements, i.e., the user-perceived behavior of any COCONUT element implemented with a distributed set of concurrent replicas is provably indistinguishable from its singleton implementation. Finally, we build a prototype of COCONUT and experimentally demonstrate its correct behavior. We also show that its abstraction enables a more efficient implementation of seamless scale-out compared to a naive baseline.
A key use of software-defined networking is to enable scale-out of network data plane elements. Naively scaling networking elements, however, can cause incorrect security responses. For example, we show that an IDS system which operates correctly as a single network element can erroneously and permanently block hosts when it is replicated. Similarly, a scaled-out firewall can incorrectly block hosts.
In this paper, we provide a system, COCONUT, for seamless scale-out of network forwarding elements; that is, an SDN application programmer can program to what functionally appears to be a single forwarding element, but which may be replicated behind the scenes. To do this, we identify the key property for seamless scale out, weak causality, and guarantee it through a practical and scalable implementation of vector clocks in the data plane. We formally prove that COCONUT enables seamless scale out of networking elements, i.e., the user-perceived behavior of any COCONUT element implemented with a distributed set of concurrent replicas is provably indistinguishable from its singleton implementation. Finally, we build a prototype of COCONUT and experimentally demonstrate its correct behavior. We also show that its abstraction enables a more efficient implementation of seamless scale-out compared to a naive baseline.
This work was funded by the SoS lablet at the University of Illinois at Urbana-Champaign.
Modern mobile platforms rely on a permission model to guard the system's resources and apps. In Android, since the permissions are granted at the granularity of apps, and all components belonging to an app inherit those permissions, an app's components are typically over-privileged, i.e., components are granted more privileges than they need to complete their tasks. Systematic violation of least-privilege principle in Android has shown to be the root cause of many security vulnerabilities. To mitigate this issue, we have developed DELDROID, an automated system for determination of least privilege architecture in Android and its enforcement at runtime. A key contribution of our approach is the ability to limit the privileges granted to apps without the need to modify them. DELDROID utilizes static program analysis techniques to extract the exact privileges each component needs for providing its functionality. A Multiple-Domain Matrix representation of the system's architecture is then used to automatically analyze the security posture of the system and derive its least-privilege architecture. Our experiments on hundreds of real world apps corroborate DELDROID's ability in effectively establishing the least-privilege architecture and its benefits in alleviating the security threats.
Information security can benefit from real-time cyber threat indicator sharing, in which companies and government agencies share their knowledge of emerging cyberattacks to benefit their sector and society at large. As attacks become increasingly sophisticated by exploiting behavioral dimensions of human computer operators, there is an increased risk to systems that store personal information. In addition, risk increases as individuals blur the boundaries between workplace and home computing (e.g., using workplace computers for personal reasons). This paper describes an architecture to leverage individual perceptions of privacy risk to compute privacy risk scores over cyber threat indicator data. Unlike security risk, which is a risk to a particular system, privacy risk concerns an individual's personal information being accessed and exploited. The architecture integrates tools to extract information entities from textual threat reports expressed in the STIX format and privacy risk estimates computed using factorial vignettes to survey individual risk perceptions. The architecture aims to optimize for scalability and adaptability to achieve real-time risk scoring.
Enterprise networks today have highly diverse correctness requirements and relatively common performance objectives. As a result, preferred abstractions for enterprise networks are those which allow matching security and correctness specifications, while transparently managing performance. Existing SDN network management architectures, however, bundle correctness and performance as a single abstraction. We argue that this creates an SDN ecosystem that is unnecessarily hard to build, maintain and evolve. We advocate a separation of the diverse correctness abstractions from generic performance optimization, to enable easier evolution of SDN controllers and platforms. We propose Oreo, a first step towards a common and relatively transparent performance optimization layer for SDN. Oreo performs the optimization by first building a model that describes every flow in the network, and then performing network-wide, multi-objective optimization based on this model without disrupting higher level security and correctness.
Authors: Santhosh Prabhu, Mo Dong, Tong Meng, P. Brighten Godfrey, and Matthew Caesar
Enterprise networks today have highly diverse correctness requirements and relatively common performance objectives. As a result, preferred abstractions for enterprise networks are those which allow matching correctness specification, while transparently managing performance. Existing SDN network management architectures, however, bundle correctness and performance as a single abstraction. We argue that this creates an SDN ecosystem that is unnecessarily hard to build, maintain and evolve. We advocate a separation of the diverse correctness abstractions from generic performance optimization, to enable easier evolution of SDN controllers and platforms. We propose Oreo, a first step towards a common and relatively transparent performance optimization layer for SDN. Oreo performs the optimization by first building a model that describes every flow in the network, and then performing network-wide, multi-objective optimization based on this model without disrupting higher level correctness.
State estimation is a fundamental problem for monitoring and controlling systems. Engineering systems interconnect sensing and computing devices over a shared bandwidth-limited channels, and therefore, estimation algorithms should strive to use bandwidth optimally. We present a notion of entropy for state estimation of switched nonlinear dynamical systems, an upper bound for it and a state estimation algorithm for the case when the switching signal is unobservable. Our approach relies on the notion of topological entropy and uses techniques from the theory for control under limited information. We show that the average bit rate used is optimal in the sense that, the efficiency gap of the algorithm is within an additive constant of the gap between estimation entropy of the system and its known upper-bound. We apply the algorithm to two system models and discuss the performance implications of the number of tracked modes.
Race conditions are difficult to detect because they usually occur only under specific execution interleavings. Numerous program analysis and testing techniques have been proposed to detect race conditions between threads on single applications. However, most of these techniques neglect races that occur at the process level due to complex system event interactions. This article presents a framework, SIMEXPLORER, that allows engineers to effectively test for process-level race conditions. SIMEXPLORER first uses dynamic analysis techniques to observe system execution, identify program locations of interest, and report faults related to oracles. Next, it uses virtualization to achieve the fine-grained controllability needed to exercise event interleavings that are likely to expose races. We evaluated the effectiveness of SIMEXPLORER on 24 real-world applications containing both known and unknown process-level race conditions. Our results show that SIMEXPLORER is effective at detecting these race conditions, while incurring an overhead that is acceptable given its effectiveness improvements.
Many authentication schemes ask users to manually compare compact representations of cryptographic keys, known as fingerprints. If the fingerprints do not match, that may signal a man-in-the-middle attack. An adversary performing an attack may use a fingerprint that is similar to the target fingerprint, but not an exact match, to try to fool inattentive users. Fingerprint representations should thus be both usable and secure. We tested the usability and security of eight fingerprint representations under different configurations. In a 661-participant between-subjects experiment, participants compared fingerprints under realistic conditions and were subjected to a simulated attack. The best configuration allowed attacks to succeed 6% of the time; the worst 72%. We find the seemingly effective compare-and-select approach performs poorly for key fingerprints and that graphical fingerprint representations, while intuitive and fast, vary in performance. We identify some fingerprint representations as particularly promising.
When multiple apps on an Android platform interact, faults and security vulnerabilities can occur. Software engineers need to be able to analyze interacting apps to detect such problems. Current approaches for performing such analyses, however, do not scale to the numbers of apps that may need to be considered, and thus, are impractical for application to realworld scenarios. In this paper, we introduce JITANA, a program analysis framework designed to analyze multiple Android apps simultaneously. By using a classloader-based approach instead of a compiler-based approach such as SOOT, JITANA is able to simultaneously analyze large numbers of interacting apps, perform on-demand analysis of large libraries, and effectively analyze dynamically generated code. Empirical studies of JITANA show that it is substantially more efficient than a state-of-theart approach, and that it can effectively and efficiently analyze complex apps including Facebook, Pokemon Go, and Pandora ´ that the state-of-the-art approach cannot handle.
Non-volatile memory express (NVMe) based SSDs and the NUMA platform are widely adopted in servers to achieve faster storage speed and more powerful processing capability. As of now, very little research has been conducted to investigate the performance and energy efficiency of the stateof-the-art NUMA architecture integrated with NVMe SSDs, an emerging technology used to host parallel I/O threads. As this technology continues to be widely developed and adopted, we need to understand the runtime behaviors of such systems in order to design software runtime systems that deliver optimal performance while consuming only the necessary amount of energy. This paper characterizes the runtime behaviors of a Linuxbased NUMA system employing multiple NVMe SSDs. Our comprehensive performance and energy-efficiency study using massive numbers of parallel I/O threads shows that the penalty due to CPU contention is much smaller than that due to remote access of NVMe SSDs. Based on this insight, we develop a dynamic “lesser evil” algorithm called ESN, to minimize the impact of these two types of penalties. ESN is an energyefficient profiling-based I/O thread scheduler for managing I/O threads accessing NVMe SSDs on NUMA systems. Our empirical evaluation shows that ESN can achieve optimal I/O throughput and latency while consuming up to 50% less energy and using fewer CPUs.
Though immutability has been long-proposed as a way to prevent bugs in software, little is known about how to make immutability support in programming languages effective for software engineers. We designed a new formalism that extends Java to support transitive class immutability, the form of immutability for which there is the strongest empirical support, and implemented that formalism in a tool called Glacier. We applied Glacier successfully to two real-world systems. We also compared Glacier to Java’s final in a user study of twenty participants. We found that even after being given instructions on how to express immutability with final, participants who used final were unable to express immutability correctly, whereas almost all participants who used Glacier succeeded. We also asked participants to make specific changes to immutable classes and found that participants who used final all incorrectly mutated immutable state, whereas almost all of the participants who used Glacier succeeded. Glacier represents a promising approach to enforcing immutability in Java and provides a model for enforcement in other languages.
Software-defined networking (SDN) technology promises centralized and rapid network provisioning, holistic management, low operational cost, and improved network visibility. Researchers have developed multiple SDN simulation and emulation platforms to expedite the adoption of many emerging SDN-based applications to production systems. However, the scalability of those platforms is often limited by the underlying physical hardware resources, which inevitably affects the simulation delity in large-scale network settings. In this paper, we present a model abstraction technique that e ectively transforms the network devices in an SDN-based network to one virtualized switch model. While signi cantly reducing the model execution time and enabling the real-time simulation capability, our abstracted model also preserves the end-to-end forwarding behavior of the original network. To achieve this, we first classify packets with the same forwarding behavior into smaller and disjoint Equivalence Classes (ECes) by analyzing the OpenFlow rules installed on the SDN devices. We then create a graph model representing the forwarding behavior of each EC. By traversing those graphs, we nally construct the rules of the big-switch model to e ectively preserve the original network's end-to-end forwarding behavior. Experimental results demonstrate that the network forwarding logic equivalence is well preserved between the abstracted model and the original SDN network. The model abstraction process is fast, e.g., 3.15 seconds to transform a medium-scale tree network consisting of 53,260 rules. The big-switch model is able to speed up the simulation by 4.3 times in average and up to 6.69 times among our evaluation experiments.
Certification schemes exist to regulate software systems and prevent them from being deployed before they are judged fit to use. However, practitioners are often unsatisfied with the efficiency of certification standards and processes. In this study, we analyzed two certification standards, Common Criteria and DO-178C, and collected insights from literature and from interviews with subject-matter experts to identify concepts affecting the efficiency of certification processes. Our results show that evaluation time, reusability of evaluation artifacts, and composition of systems and certified artifacts are barriers to achieve efficient certification.
Programming language definitions assign formal meaning to complete programs. Programmers, however, spend a substantial amount of time interacting with incomplete programs -- programs with holes, type inconsistencies and binding inconsistencies -- using tools like program editors and live programming environments (which interleave editing and evaluation). Semanticists have done comparatively little to formally characterize (1) the static and dynamic semantics of incomplete programs; (2) the actions available to programmers as they edit and inspect incomplete programs; and (3) the behavior of editor services that suggest likely edit actions to the programmer based on semantic information extracted from the incomplete program being edited, and from programs that the system has encountered in the past. As such, each tool designer has largely been left to develop their own ad hoc heuristics.
This paper serves as a vision statement for a research program that seeks to develop these "missing" semantic foundations. Our hope is that these contributions, which will take the form of a series of simple formal calculi equipped with a tractable metatheory, will guide the design of a variety of current and future interactive programming tools, much as various lambda calculi have guided modern language designs. Our own research will apply these principles in the design of Hazel, an experimental live lab notebook programming environment designed for data science tasks. We plan to co-design the Hazel language with the editor so that we can explore concepts such as edit-time semantic conflict resolution mechanisms and mechanisms that allow library providers to install library-specific editor services.
The principle of least authority states that each component of the system should be given authority to access only the information and resources that it needs for its operation. This principle is fundamental to the secure design of software systems, as it helps to limit an application’s attack surface and to isolate vulnerabilities and faults. Unfortunately, current programming languages do not provide adequate help in controlling the authority of application modules, an issue that is particularly acute in the case of untrusted third-party extensions. In this paper, we present a language design that facilitates controlling the authority granted to each application module. The key technical novelty of our approach is that modules are firstclass, statically typed capabilities. First-class modules are essentially objects, and so we formalize our module system by translation into an object calculus and prove that the core calculus is typesafe and authority-safe. Unlike prior formalizations, our work defines authority non-transitively, allowing engineers to reason about software designs that use wrappers to provide an attenuated version of a more powerful capability. Our approach allows developers to determine a module’s authority by examining the capabilities passed as module arguments when the module is created, or delegated to the module later during execution. The type system facilitates this by identifying which objects provide capabilities to sensitive resources, and by enabling security architects to examine the capabilities passed into and out of a module based only on the module’s interface, without needing to examine the module’s implementation code. An implementation of the module system and illustrative examples in the Wyvern programming language suggest that our approach can be a practical way to control module authority.
In a discrete-time linear multi-agent control system, where the agents are coupled via an environmental state, knowledge of the environmental state is desirable to control the agents locally. However, since the environmental state depends on the behavior of the agents, sharing it directly among these agents jeopardizes the privacy of the agents' pro les, de ned as the combination of the agents' initial states and the sequence of local control inputs over time. A commonly used solution is to randomize the environmental state before sharing { this leads to a natural trade-o between the privacy of the agents' pro les and the variance of estimating the environmental state. By treating the multi-agent system as a probabilistic model of the environmental state parametrized by the agents' pro les, we show that when the agents' pro les is "-di erentially private, there is a lower bound on the `1 induced norm of the covariance matrix of the minimum-variance unbiased estimator of the environmental state. This lower bound is achieved by a randomized mechanism that uses Laplace noise.