Biblio
Sandboxes impose a security policy, isolating applications and their components from the rest of a system. While many sandboxing techniques exist, state-of-the-art sandboxes generally perform their functions within the system that is being defended. As a result, when the sandbox fails or is bypassed, the security of the surrounding system can no longer be assured. We experiment with the idea of in-nimbo sandboxing, which encapsulates untrusted computations away from the system we are trying to protect. The idea is to delegate computations that may be vulnerable or malicious to virtual machine instances in a cloud computing environment.
This may not reduce the likelihood of an in-situ sandbox compromise, but it can significantly reduce the consequences should that possibility be realized. To achieve this advantage, there are additional requirements, including: (1) a regulated channel between the local and cloud environments that supports interaction with the encapsulated application, and (2) a performance design that acceptably minimizes latencies in excess of the in-situ baseline.
To test the feasibility of the idea, we built an in-nimbo sandbox for Adobe Reader, an application that historically has been subject to significant attacks. We undertook a prototype deployment with PDF users in a large aerospace firm. In addition to thwarting several examples of existing PDF-based malware, we found that the added increment of latency, perhaps surprisingly, does not overly impair the user experience with respect to performance or usability.
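To make the regulated-channel requirement concrete, here is a minimal sketch of how such a channel might be structured, with the local side enforcing a message whitelist in each direction. The message types and framing are our invention for illustration, not the deployed system's protocol.

```python
import json
import struct

# Hypothetical message whitelists; only these types may cross the boundary.
ALLOWED_TO_CLOUD = {"open_document", "scroll", "zoom", "close"}
ALLOWED_FROM_CLOUD = {"rendered_page", "status"}  # pixels and status only

def _recv_exact(sock, n):
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("channel closed")
        buf += chunk
    return buf

def send_msg(sock, msg):
    # Length-prefixed JSON framing (illustrative, not the paper's wire format).
    data = json.dumps(msg).encode()
    sock.sendall(struct.pack(">I", len(data)) + data)

def recv_msg(sock):
    (length,) = struct.unpack(">I", _recv_exact(sock, 4))
    return json.loads(_recv_exact(sock, length))

def forward_to_cloud(sock, msg):
    # Local policy enforcement: the untrusted VM sees only benign UI events.
    if msg.get("type") in ALLOWED_TO_CLOUD:
        send_msg(sock, msg)

def accept_from_cloud(sock):
    # Only rendered pixels and status cross back; anything else is dropped,
    # so a compromised viewer cannot drive arbitrary local actions.
    msg = recv_msg(sock)
    return msg if msg.get("type") in ALLOWED_FROM_CLOUD else None
```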
Self-adaptive systems have the ability to adapt their behavior to dynamic operating conditions. In reaction to changes in the environment, these systems determine the appropriate corrective actions based in part on information about which action will have the best impact on the system. Existing models used to describe the impact of adaptations are either unable to capture the underlying uncertainty and variability of such dynamic environments, or are not compositional and are described at a level of abstraction too low to scale in terms of the specification effort required for non-trivial systems. In this paper, we address these shortcomings by describing an approach to the specification of impact models based on architectural system descriptions, which at the same time allows us to represent both variability and uncertainty in the outcome of adaptations, hence improving the selection of the best corrective action. The core of our approach is an impact model language equipped with a formal semantics defined in terms of Discrete Time Markov Chains. To validate our approach, we show how employing our language can improve the accuracy of predictions used for decision-making in the Rainbow framework for architecture-based self-adaptation.
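As a rough illustration of the style of semantics involved (generic DTMC notation of ours, not the paper's impact model language):

```latex
% Illustrative only: a tactic a applied in state s has probabilistic
% outcomes s', and its expected impact is
\[
  \mathbb{E}[\mathrm{impact}(a \mid s)]
    \;=\; \sum_{s' \in S} P_a(s, s')\, U(s'),
  \qquad \text{with } \sum_{s' \in S} P_a(s, s') = 1,
\]
% where P_a(s, s') is the transition probability of the DTMC induced by a
% and U scores each outcome. Variability and uncertainty in an adaptation's
% effect appear as probability mass spread over several outcomes s'.
```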
In prior research we have developed a Curry-Howard interpretation of linear sequent calculus as session-typed processes. In this paper we uniformly integrate this computational interpretation into a functional language via a linear contextual monad that isolates session-based concurrency. Monadic values are open process expressions and are first-class objects in the language, thus providing a logical foundation for higher-order session-typed processes. We illustrate how the combined use of the monad and recursive types allows us to cleanly write a rich variety of concurrent programs, including higher-order programs that communicate processes. We show the standard metatheoretic result of type preservation, as well as a global progress theorem, which, to the best of our knowledge, is new in the higher-order session-typed setting.
Identifying the factors behind countries' weakness to cyber-attacks is an important step towards addressing these weaknesses at the root level. For example, identifying the factors that make some countries cyber-crime safe havens can inform policy actions about how to reduce the attractiveness of these countries to cyber-criminals. Currently, however, identifying these factors is mostly based on expert opinions and speculations.
In this work, we perform an empirical study to statistically test the validity of these opinions and speculations. In our analysis, we use Symantec's Worldwide Intelligence Network Environment (WINE) Intrusion Prevention System (IPS) telemetry data, which contain attack reports from more than 10 million customer computers worldwide. We use regression analysis to test for the relevance of multiple factors including monetary and computing resources, cyber-security research and institutions, and corruption.
Our analysis confirms some hypotheses and disproves others. We find that many countries in Eastern Europe extensively host attacking computers because of a combination of good computing infrastructure and high corruption rates. We also find that web attacks and fake applications are most prevalent in rich countries because attacks on these countries are more lucrative. Finally, we find that computers in Africa launch cyber-attacks at the lowest rates. This is surprising given the bad cyber reputation of some African countries, such as Nigeria. Our research has many policy implications.
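The regression setup can be pictured as an ordinary country-level least-squares fit. The sketch below uses invented feature names and numbers purely to show the shape of the analysis; the study's actual variables come from the WINE telemetry and public indicators.

```python
import numpy as np

# Rows: countries. Columns: intercept, log GDP per capita, log bandwidth
# per capita, corruption index (higher = more corrupt). All values invented.
X = np.array([
    [1.0,  9.2, 7.1, 2.5],
    [1.0,  8.7, 6.8, 6.1],
    [1.0, 10.1, 7.9, 1.8],
    [1.0,  7.9, 5.2, 7.4],
])
# Response: log attack rate per host observed in the telemetry (invented).
y = np.array([1.2, 3.4, 0.9, 2.8])

# Ordinary least squares fit of the country-level model.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(dict(zip(["intercept", "gdp", "bandwidth", "corruption"], beta)))
```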
Finding architectural flaws in object-oriented code requires a runtime architecture that shows multiple components of the same type that are used in different contexts. Previous work showed that a runtime architecture can be approximated by an abstract object graph that a static analysis extracts from code with Ownership Domain annotations. To find architectural flaws, it is not enough to reason about the presence or absence of communication; additional work is needed to reason about the content of the communication. The contribution of this paper is a static analysis that extracts a hierarchical object graph with dataflow edges that refer to objects. The extraction analysis combines the aliasing precision provided by Ownership Domains with a domain-sensitive value flow analysis. We evaluate the extraction analysis on an open-source Android application and discuss examples of dataflow edges that refer to objects in actual domains or to flow objects in domains corresponding to unique annotations.
In our experience, exploratory testing has reached a level of maturity that makes it a practical and often the most cost-effective approach to testing. Notably, previous work has demonstrated that exploratory testing is capable of finding bugs even in well-tested systems [4, 17, 24, 23]. However, the number of bugs found gives little indication of the efficiency of a testing approach. To drive testing efficiency, this paper focuses on techniques for measuring and maximizing the coverage achieved by exploratory testing. In particular, this paper describes the design, implementation, and evaluation of Eta, a framework for exploratory testing of multithreaded components of a large-scale cluster management system at Google. For simple tests (with millions to billions of possible executions), Eta achieves complete coverage one to two orders of magnitude faster than random testing. For complex tests, Eta adopts a state space reduction technique to avoid the need to explore over 85% of executions and harnesses parallel processing to explore multiple test executions concurrently, achieving a throughput increase of up to 17.5×.
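A toy model shows why systematic enumeration beats random sampling on small state spaces: it covers every interleaving exactly once. This sketch is our illustration of the general idea, not Eta's implementation.

```python
from itertools import combinations

def interleavings(a, b):
    """Yield every interleaving of step sequences a and b (orders preserved)."""
    n, m = len(a), len(b)
    for positions in combinations(range(n + m), n):
        pos = set(positions)
        schedule, ai, bi = [], 0, 0
        for i in range(n + m):
            if i in pos:
                schedule.append(a[ai]); ai += 1
            else:
                schedule.append(b[bi]); bi += 1
        yield tuple(schedule)

# Two threads performing a non-atomic increment on a shared counter:
# each reads the counter, then writes read_value + 1.
T1 = ["r1", "w1"]
T2 = ["r2", "w2"]

outcomes = set()
for sched in interleavings(T1, T2):
    counter, local = 0, {}
    for op in sched:
        tid = op[1]
        if op[0] == "r":
            local[tid] = counter          # read into thread-local value
        else:
            counter = local[tid] + 1      # write back incremented value
    outcomes.add(counter)

print(outcomes)  # {1, 2}: complete coverage exposes the lost update (1)
```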
We investigate a notion of behavioral genericity in the context of session type disciplines. To this end, we develop a logically motivated theory of parametric polymorphism, reminiscent of the Girard-Reynolds polymorphic λ-calculus, but cast in the setting of concurrent processes. In our theory, polymorphism accounts for the exchange of abstract communication protocols and dynamic instantiation of heterogeneous interfaces, as opposed to the exchange of data types and dynamic instantiation of individual message types. Our polymorphic session-typed process language satisfies strong forms of type preservation and global progress, is strongly normalizing, and enjoys a relational parametricity principle. Combined, our results confer strong correctness guarantees for communicating systems. In particular, parametricity is key to deriving non-trivial results about internal protocol independence, a concurrent analogue of representation independence, and non-interference properties of modular, distributed systems.
To evolve object-oriented code, one must understand both the code structure in terms of classes and the runtime structure in terms of abstractions of the objects that are created and the relations between those objects. To help with this understanding, static program analysis can extract heap abstractions such as object graphs. But the extracted graphs can become too large if they do not sufficiently abstract objects, or too imprecise if they abstract objects excessively, to the point of resembling a class diagram in which one box per class represents all of that class's instances. One previously proposed solution uses both annotations and abstract interpretation to extract a global, hierarchical, abstract object graph that conveys both abstraction and design intent, but can still be related to the code structure. In this paper, we define metrics that relate nodes and edges in the object graph to elements in the code structure, to measure how they differ, and whether the differences are indicative of language or design features such as encapsulation, polymorphism, and inheritance. We compute the metrics across eight systems totaling over 100 KLOC, and show a statistically significant difference between the code and the object graph. In several cases, the magnitude of this difference is large.
Most accounts of information flow security in programming languages emphasize non-interference to characterize security: in a secure program, changes to high-security inputs do not alter the values of low-security outputs. The definition of non-interference is incompatible with declassification, which allows some low-security outputs to be influenced by high-security inputs. We propose an alternative account of information flow based on an epistemic logic of computational effects. Rather than view a program as a function from inputs to outputs, we instead embrace the principle that information flow security is concerned with the effects a program has on its execution environment. These effects are modeled using a substructural epistemic logic that tracks the flow of knowledge gained by principals and communication channels during execution. Confidentiality is expressed by proving necessary conditions for a principal to know a sensitive fact at the end of an execution. In the simplest case the necessary condition is falsehood, which means that a principal cannot know a secret as a result of a well-typed execution of a program. In the presence of declassification, a necessary condition for disclosure is the existence of a proof of authorization in a formal authorization logic, expressing that sensitive data is disclosed only when explicitly authorized. Rather than being taken as the primary result, the classical non-interference property arises in the proof of adequacy of the epistemic theory of disclosure, ensuring that it accurately models program behavior. We suggest that an epistemic account of information flow security is both more natural and more expressive than classical accounts based only on non-interference.
Complementarity problems play a central role in equilibrium finding, physical simulation, and optimization. As a consequence, we are interested in understanding how to solve these problems quickly, and this often involves approximation. In this paper we present a method for approximately solving strictly monotone linear complementarity problems (LCPs) with a Galerkin approximation. We also give bounds for the approximation error, and prove novel bounds on perturbation error. These perturbation bounds suggest that a Galerkin approximation may be much less sensitive to noise than the original LCP.
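For concreteness, here is the standard statement of an LCP and one common way to pose a Galerkin projection of it (our generic notation, which may differ from the paper's exact formulation):

```latex
% A linear complementarity problem LCP(M, q) asks for z with
\[
  z \ge 0, \qquad Mz + q \ge 0, \qquad z^{\top}(Mz + q) = 0 .
\]
% A Galerkin approximation restricts z to a low-dimensional basis,
% z = \Phi w with \Phi having far fewer columns than rows, and solves a
% projected problem of the form
\[
  w \ge 0, \qquad \Phi^{\top}(M\Phi w + q) \ge 0, \qquad
  w^{\top}\Phi^{\top}(M\Phi w + q) = 0 ,
\]
% which is itself an LCP in the much smaller variable w.
```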
Both SAT and #SAT can represent difficult problems in seemingly dissimilar areas such as planning, verification, and probabilistic inference. Here, we examine an expressive new language, #∃SAT, that generalizes both of these languages. #∃SAT problems require counting the number of satisfiable formulas in a concisely-describable set of existentially quantified, propositional formulas. We characterize the expressiveness and worst-case difficulty of #∃SAT by proving it is complete for the complexity class #P^NP[1], and relating this class to more familiar complexity classes. We also experiment with three new general-purpose #∃SAT solvers on a battery of problem distributions, including a simple logistics domain. Our experiments show that, despite the formidable worst-case complexity of #P^NP[1], many of the instances can be solved efficiently by noticing and exploiting a particular type of frequent structure.
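The counting semantics can be pinned down with a brute-force reference implementation: count the assignments to the outer variables X for which some assignment to the existential variables Y satisfies the formula. The doubly exponential loop below is only for definition's sake, and the example formula is invented; the paper's solvers are far more sophisticated.

```python
from itertools import product

def count_exists_sat(formula, n_outer, n_inner):
    """Return |{xs : exists ys . formula(xs, ys)}|."""
    return sum(
        any(formula(xs, ys) for ys in product([False, True], repeat=n_inner))
        for xs in product([False, True], repeat=n_outer)
    )

# phi(x1, x2, y1) = (x1 or y1) and (not x2 or not y1)
phi = lambda xs, ys: (xs[0] or ys[0]) and (not xs[1] or not ys[0])
print(count_exists_sat(phi, n_outer=2, n_inner=1))  # 3 of the 4 x-assignments
```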
Smartphone users often use private and enterprise data with untrusted third party applications. The fundamental lack of secrecy guarantees in smartphone OSes, such as Android, exposes this data to the risk of unauthorized exfiltration. A natural solution is the integration of secrecy guarantees into the OS. In this paper, we describe the challenges for decentralized information flow control (DIFC) enforcement on Android. We propose context-sensitive DIFC enforcement via lazy polyinstantiation and practical and secure network export through domain declassification. Our DIFC system, Weir, is backwards compatible by design, and incurs less than 4 ms overhead for component startup. With Weir, we demonstrate practical and secure DIFC enforcement on Android.
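To make the secrecy model concrete, here is an illustrative label-based DIFC check in the general style such systems use. The tag names and the simple authority set are ours; Weir's actual mechanisms (lazy polyinstantiation, domain declassification) are more involved.

```python
from dataclasses import dataclass, field

@dataclass
class Component:
    name: str
    label: frozenset = field(default_factory=frozenset)  # secrecy tags

def can_flow(src: Component, dst: Component) -> bool:
    # Data may flow only if dst carries every tag that taints src.
    return src.label <= dst.label

def declassify(label: frozenset, tag: str, authority: set) -> frozenset:
    # A tag may be removed only with the owning domain's authority.
    if tag not in authority:
        raise PermissionError(f"no authority to remove {tag!r}")
    return label - {tag}

app = Component("third_party_app")
contacts = Component("contacts_provider", frozenset({"contacts_secret"}))
network = Component("network")

print(can_flow(contacts, app))  # False: app's label doesn't dominate
app.label = app.label | {"contacts_secret"}    # taint the app instead
print(can_flow(contacts, app))  # True: reading is now permitted
print(can_flow(app, network))   # False: export blocked until declassified
app.label = declassify(app.label, "contacts_secret", {"contacts_secret"})
print(can_flow(app, network))   # True: owner-authorized export
```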
In a multiagent system, a (social) norm describes what the agents may expect from each other. Norms promote autonomy (an agent need not comply with a norm) and heterogeneity (a norm describes interactions at a high level independent of implementation details). Researchers have studied norm emergence through social learning where the agents interact repeatedly in a graph structure.
In contrast, we consider norm emergence in an open system, where membership can change, and where no predetermined graph structure exists. We propose Silk, a mechanism wherein a generator monitors interactions among member agents and recommends norms to help resolve conflicts. Each member decides whether to accept or reject a recommended norm. Upon exiting the system, a member passes its experience along to incoming members of the same type. Thus, members develop norms in a hybrid manner to resolve conflicts.
We evaluate Silk via simulation in the traffic domain. Our results show that social norms promoting conflict resolution emerge in both moderate and selfish societies via our hybrid mechanism.
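The hybrid mechanism can be pictured as a compact loop: the generator recommends norms when it observes conflicts, members individually accept or reject, and departing members hand experience to newcomers of the same type. This toy rendering uses invented acceptance and conflict policies and is not the paper's traffic simulation.

```python
import random

class Member:
    def __init__(self, mtype, experience=None):
        self.mtype = mtype
        self.experience = dict(experience or {})  # norm -> observed benefit
        self.norms = set()

    def consider(self, norm):
        # Autonomy: a member may reject the generator's recommendation.
        if self.experience.get(norm, 0) > 0 or random.random() < 0.5:
            self.norms.add(norm)

def on_conflict(generator, a, b):
    norm = generator(a, b)  # e.g., "yield_to_the_right"
    for m in (a, b):
        m.consider(norm)
        m.experience[norm] = m.experience.get(norm, 0) + 1

def replace_member(leaving, members):
    # Open system: the newcomer inherits the leaver's experience.
    members.remove(leaving)
    members.append(Member(leaving.mtype, experience=leaving.experience))

members = [Member("car"), Member("car"), Member("truck")]
on_conflict(lambda a, b: "yield_to_the_right", members[0], members[1])
replace_member(members[0], members)
print([m.mtype for m in members], members[-1].experience)
```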
To interact effectively, agents must enter into commitments. What should an agent do when these commitments conflict? We describe Coco, an approach for reasoning about which specific commitments apply to specific parties in light of general types of commitments, specific circumstances, and dominance relations among specific commitments. Coco adapts answer-set programming to identify a maximal set of nondominated commitments. It provides a modeling language and tool geared to support practical applications.
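The selection Coco performs can be illustrated without answer-set programming: keep every commitment not dominated by another applicable one. The commitments and dominance pairs below are invented.

```python
# Toy nondominated-commitment selection (plain Python, not ASP).
commitments = {"c_share_basic", "c_share_detailed", "c_withhold"}
dominates = {("c_share_detailed", "c_share_basic"),
             ("c_withhold", "c_share_detailed")}

nondominated = {c for c in commitments
                if not any((d, c) in dominates for d in commitments)}
print(nondominated)  # {'c_withhold'}
```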
Privacy remains a major challenge today partly because it brings together social and technical considerations. Yet, current software engineering focuses only on the technical aspects. In contrast, our approach, Revani, understands privacy from the standpoint of sociotechnical systems (STSs), with particular attention to the social elements of STSs. We specify STSs via a combination of technical mechanisms and social norms founded on accountability.
Revani provides a way to formally represent mechanisms and norms, and applies model checking to verify whether specified mechanisms and norms would satisfy the requirements of the stakeholders. Additionally, Revani provides a set of design patterns and a revision tool to update an STS specification as necessary. We demonstrate the working of Revani on a healthcare emergency use case pertaining to disasters.
Recent data breaches in domains such as healthcare, where confidentiality of data is crucial, indicate that misuse cases often originate from user errors rather than vulnerabilities in the technical (software or hardware) architecture. Current requirements engineering (RE) approaches determine what access control mechanisms are needed to protect sensitive resources. However, current RE approaches inadequately characterize how a user is expected to interact with others in relation to the relevant resources. Consequently, a requirements analyst cannot readily identify the vulnerabilities based on user interactions. We adopt social norms as a natural, formal means of characterizing user interactions wherein potential misuses map to norm violations. Our research goal is to help analysts identify misuse cases by systematically generating potential temporal enactments that violate formally stated social norms. We propose Nane: a formal framework for identifying misuse cases from norm enactments. We represent misuse cases formally, and propose a semiautomated process for identifying misuse cases based on norm enactments. We show that our process is sound and complete with respect to the stated norms. We discuss the expressiveness of our representation, and demonstrate how Nane enables monitoring of misuse cases via temporal reasoning.
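The generation step can be illustrated by brute force: enumerate short event traces and keep those that violate a stated norm. The "every access must later be notified" norm below is an invented stand-in for Nane's formal norms, and real enactment generation is far more selective.

```python
from itertools import product

EVENTS = ["access", "notify", "idle"]

def violates(trace):
    # Misuse: some access is never followed by a notification.
    for i, e in enumerate(trace):
        if e == "access" and "notify" not in trace[i + 1:]:
            return True
    return False

misuse_cases = [t for t in product(EVENTS, repeat=3) if violates(t)]
print(len(misuse_cases), misuse_cases[:3])
```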
In an organization, the interactions users have with software leave patterns, or traces, of the parts of the system accessed. These interactions can be associated with the underlying software architecture. The first step in detecting problems like insider threat is to detect the traces that are anomalous. Here, we propose a method to find anomalous users by leveraging these interaction traces, categorized by user roles. We propose a model-based approach to cluster user sequences and find outliers. We show that the approach works on a simulation of a large-scale system based on an Amazon web application style.
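One plausible instantiation of the model-based approach: fit a per-role Markov model over the components each user touches, then flag traces with unusually low likelihood under their role's model. Smoothing, thresholds, and the example traces below are invented for illustration.

```python
from collections import defaultdict
import math

def fit_role_model(traces):
    # First-order Markov transition probabilities for one role.
    counts = defaultdict(lambda: defaultdict(int))
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[a][b] += 1
    return {a: {b: n / sum(nexts.values()) for b, n in nexts.items()}
            for a, nexts in counts.items()}

def avg_log_likelihood(model, trace, floor=1e-6):
    # Unseen transitions get a small floor probability.
    steps = list(zip(trace, trace[1:]))
    ll = sum(math.log(model.get(a, {}).get(b, floor)) for a, b in steps)
    return ll / max(len(steps), 1)

nurse_traces = [["login", "records", "vitals", "logout"]] * 50
model = fit_role_model(nurse_traces)
suspect = ["login", "records", "billing_export", "records", "logout"]
print(avg_log_likelihood(model, suspect))  # far below the role's typical score
```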
We propose an interactive approach where analysts reason about the security of a system using an abstraction of its runtime structure, as opposed to looking at the code. They interactively refine a hierarchical object graph, set security properties on abstract objects or edges, query the graph, and investigate the results by studying highlighted objects or edges or tracing to the code. Behind the scenes, an inference analysis and an extraction analysis maintain the soundness of the graph with respect to the code.
Conventional security mechanisms at network, host, and source code levels are no longer sufficient in detecting and responding to increasingly dynamic and sophisticated cyber threats today. Detecting anomalous behavior at the architectural level can help better explain the intent of the threat and strengthen overall system security posture. To that end, we present a framework that mines software component interactions from system execution history and applies a detection algorithm to identify anomalous behavior. The framework uses unsupervised learning at runtime, can perform fast anomaly detection “on the fly”, and can quickly adapt to system load fluctuations and user behavior shifts. Our evaluation of the approach against a real Emergency Deployment System has demonstrated very promising results, showing the framework can effectively detect covert attacks, including insider threats, that may be easily missed by traditional intrusion detection methods.
Modern frameworks are required to be extensible as well as secure. However, these two qualities are often at odds. In this poster we describe an approach that uses a combination of static analysis and run-time management, based on software architecture models, that can improve security while maintaining framework extensibility. We implement a prototype of the approach for the Android platform. Static analysis identifies the architecture and communication patterns among the collection of apps on an Android device and which communications might be vulnerable to attack. Run-time mechanisms monitor these potentially vulnerable communication patterns, and adapt the system to either deny them, request explicit approval from the user, or allow them.
It is more expensive and time-consuming to build modern software without extensive supply chains. Supply chains decrease these development risks, but typically at the cost of increased security risk. In particular, it is often difficult to understand or verify what a software component delivered by a third party does or could do. Such a component could contain unwanted behaviors, vulnerabilities, or malicious code, many of which become incorporated in applications utilizing the component. Sandboxes provide relief by encapsulating a component and imposing a security policy on it. This limits the operations the component can perform without as much need to trust or verify the component. Instead, a component user must trust or verify the relatively simple sandbox. Given this appealing prospect, researchers have spent the last few decades developing new sandboxing techniques and sandboxes. However, while sandboxes have been adopted in practice, they are not as pervasive as they could be. Why are sandboxes not achieving ubiquity at the same rate as extensive supply chains? This thesis advances our understanding of and overcomes some barriers to sandbox adoption.
We systematically analyze ten years (2004–2014) of sandboxing research from top-tier security and systems conferences. We uncover two barriers: (1) sandboxes are often validated using relatively subjective techniques and (2) usability for sandbox deployers is often ignored by the studied community.
We then focus on the Java sandbox to empirically study its use within the open source community. We find features in the sandbox that benign applications do not use, which have promoted a thriving exploit landscape. We develop runtime monitors for the Java Virtual Machine (JVM) to turn off these features, stopping all known sandbox-escaping JVM exploits without breaking benign applications. Furthermore, we find that the sandbox exposes a high degree of complexity that benign applications must engage with, and that this complexity hampers sandbox use. When studying the sandbox's use, we did not find a single application that successfully deployed the sandbox for security purposes, which motivated us to tame this benignly-used complexity through tooling. We develop and evaluate a series of tools to automate the most complex tasks, which currently require error-prone manual effort. Our tools help users derive, express, and refine a security policy and impose it on targeted Java application JARs and classes. This tooling is evaluated through case studies with industrial collaborators in which we sandbox components that were previously difficult to sandbox securely.
Finally, we observe that design and implementation complexity causes sandbox developers to accidentally create vulnerable sandboxes. Thus, we develop and evaluate a sandboxing technique that leverages existing cloud computing environments to execute untrusted computations. Malicious outcomes produced by the computations are contained by ephemeral virtual machines. We describe a field trial using this technique with Adobe Reader and compare the new sandbox to existing sandboxes using a qualitative framework we developed.
Recent studies have found that parallel garbage collection performs worse with more CPUs and more collector threads. As part of this work, we further investigate this phenomenon and find that poor scalability is worst in highly scalable Java applications. Our investigation of the causes clearly reveals that efficient multi-threading in an application can prolong the average object lifespan, which results in less effective garbage collection. We also find that prolonged lifespans are the direct result of Linux's Completely Fair Scheduler, whose round-robin-like behavior can increase heap contention between application threads. If we instead use pseudo first-in-first-out to schedule application threads on large multicore systems, garbage collection scalability is significantly improved while the time spent in garbage collection is reduced by as much as 21%. The average execution time of the 24 Java applications used in our study is also reduced by 11%. Based on this observation, we propose two approaches for selecting scheduling policies based on an application's scalability profile. Our first approach uses profile information from one execution to tune subsequent executions. Our second approach dynamically collects profile information and performs policy selection during execution.
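On Linux, the policy switch itself is small; the sketch below shows profile-driven selection in that spirit. The scalability threshold and profile format are placeholders of ours, the calls are Linux-only, and setting SCHED_FIFO requires root privileges.

```python
import os

def apply_policy(pid: int, scalable: bool) -> None:
    if scalable:
        # FIFO avoids CFS's round-robin-like switching among application
        # threads, which the study found prolongs object lifespans and
        # increases heap contention during garbage collection.
        os.sched_setscheduler(pid, os.SCHED_FIFO, os.sched_param(1))
    else:
        os.sched_setscheduler(pid, os.SCHED_OTHER, os.sched_param(0))

# Hypothetical usage, driven by a prior profiling run:
# apply_policy(jvm_pid, scalable=profile["speedup_at_24_cores"] > 12.0)
```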
Software developers use #ifdef directives to support code configurability, allowing software product diversification. But because functions can lie on many execution paths that depend on complex combinations of configuration options, the introduction of an #ifdef for a given purpose (such as adding a new feature to a program) can enable unintended function calls, which can be a source of vulnerabilities. Part of the difficulty lies in maintaining mental models of all the dependencies. We propose analytic visualizations of the variational call graph to capture dependencies across configurations, and create visualizations that demonstrate how they would help developers visually reason through the implications of diversification, for example by visually performing change impact analysis.
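A variational call graph can be modeled as edges guarded by presence conditions over configuration options, which directly supports the kind of change impact queries the visualizations target. The function names and conditions below are illustrative, not from the paper.

```python
from itertools import product

# edge -> presence condition over configuration options
vcg = {
    ("main", "parse"):       lambda cfg: True,
    ("parse", "decompress"): lambda cfg: cfg["ZLIB"],
    ("parse", "exec_macro"): lambda cfg: cfg["MACROS"] and not cfg["SAFE_MODE"],
}

def calls_under(vcg, cfg):
    """All calls enabled by one concrete configuration."""
    return {edge for edge, cond in vcg.items() if cond(cfg)}

def configs_enabling(vcg, edge, options):
    """Change impact: every configuration that enables a given edge."""
    names = sorted(options)
    enabling = []
    for vals in product([False, True], repeat=len(names)):
        cfg = dict(zip(names, vals))
        if vcg[edge](cfg):
            enabling.append(cfg)
    return enabling

cfg = {"ZLIB": True, "MACROS": True, "SAFE_MODE": False}
print(calls_under(vcg, cfg))
# Which configurations let parse() reach exec_macro()?
print(configs_enabling(vcg, ("parse", "exec_macro"),
                       {"ZLIB", "MACROS", "SAFE_MODE"}))
```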