Visible to the public Biblio

Found 164 results

Filters: Keyword is CMU  [Clear All Filters]
2016-12-05
Filipre Militao, Jonathan Aldrich, Luis Caires.  2014.  Rely-Guarantee Protocols. Proceedings of the 28th European Conference on ECOOP 2014 --- Object-Oriented Programming. 8586

The use of shared mutable state, commonly seen in object-oriented systems, is often problematic due to the potential conflicting interactions between aliases to the same state. We present a substructural type system outfitted with a novel lightweight interference control mechanism, rely-guarantee protocols, that enables controlled aliasing of shared resources. By assigning each alias separate roles, encoded in a novel protocol abstraction in the spirit of rely-guarantee reasoning, our type system ensures that challenging uses of shared state will never interfere in an unsafe fashion. In particular, rely-guarantee protocols ensure that each alias will never observe an unexpected value, or type, when inspecting shared memory regardless of how the changes to that shared state (originating from potentially unknown program contexts) are interleaved at run-time.

Claus Hunsen, Bo Zhang, Janet Siegmund, Christian Kästner, Olaf Lebenich, Martin Becker, Sven Apel.  2015.  Preprocessor-based variability in open-source and industrial software systems: An empirical study. Empirical Software Engineering. 20:1-34.

Almost every sufficiently complex software system today is configurable. Conditional compilation is a simple variability-implementation mechanism that is widely used in open-source projects and industry. Especially, the C preprocessor (CPP) is very popular in practice, but it is also gaining (again) interest in academia. Although there have been several attempts to understand and improve CPP, there is a lack of understanding of how it is used in open-source and industrial systems and whether different usage patterns have emerged. The background is that much research on configurable systems and product lines concentrates on open-source systems, simply because they are available for study in the first place. This leads to the potentially problematic situation that it is unclear whether the results obtained from these studies are transferable to industrial systems. We aim at lowering this gap by comparing the use of CPP in open-source projects and industry—especially from the embedded-systems domain—based on a substantial set of subject systems and well-known variability metrics, including size, scattering, and tangling metrics. A key result of our empirical study is that, regarding almost all aspects we studied, the analyzed open-source systems and the considered embedded systems from industry are similar regarding most metrics, including systems that have been developed in industry and made open source at some point. So, our study indicates that, regarding CPP as variability-implementation mechanism, insights, methods, and tools developed based on studies of open-source systems are transferable to industrial systems—at least, with respect to the metrics we considered.

Alireza Sadeghi, Naeem Esfahani, Sam Malek.  2014.  Mining the Categorized Software Repositories to Improve the Analysis of Security Vulnerabilities. Proceedings of the 17th International Conference on Fundamental Approaches to Software Engineering . 8411

Security has become the Achilles’ heel of most modern software systems. Techniques ranging from the manual inspection to automated static and dynamic analyses are commonly employed to identify security vulnerabilities prior to the release of the software. However, these techniques are time consuming and cannot keep up with the complexity of ever-growing software repositories (e.g., Google Play and Apple App Store). In this paper, we aim to improve the status quo and increase the efficiency of static analysis by mining relevant information from vulnerabilities found in the categorized software repositories. The approach relies on the fact that many modern software systems are developed using rich application development frameworks (ADF), allowing us to raise the level of abstraction for detecting vulnerabilities and thereby making it possible to classify the types of vulnerabilities that are encountered in a given category of application. We used open-source software repositories comprising more than 7 million lines of code to demonstrate how our approach can improve the efficiency of static analysis, and in turn, vulnerability detection.

Jeffrey Gennari, David Garlan.  2012.  Measuring Attack Surface in Software Architecture.

In this report we show how to adapt the notion of “attack surface” to formally evaluate security properties at the architectural level of design and to identify vulnerabilities in architectural designs. Further we explore the application of this metric in the context of architecture-based transformations to improve security by reducing the attack surface. These transformations are described in detail and validated with a simple experiment.

Luis Caires, Jorge Perez, Frank Pfenning, Bernardo Toninho.  2013.  Logic-Based Domain-Aware Session Types.

Software services and governing communication protocols are increasingly domain-aware. Domains can have multiple interpretations, such as the principals on whose behalf processes act or the location at which parties reside. Domains impact protocol compliance and access control, two central issues to overall functionality and correctness in distributed systems. This paper proposes a session-typed process framework for domain-aware communication-centric systems based on a CurryHoward interpretation of linear logic, here augmented with nominals from hybrid logic indicating domains. These nominals are explicit in the process expressions and govern domain migration, subject to a parametric accessibility relation familiar from the Kripke semantics for modal logic. Flexible access relationships among domains can be elegantly defined and statically enforced. The framework can also account for scenarios in which domain information is discovered only at runtime. Due to the logical origins of our systems, well-typed processes enjoy session fidelity, global progress, and termination. Moreover, well-typed processes always respect the accessibility relation and satisfy a form of domain parametricity, two properties crucial to show that domain-related properties of concrete programs are satisfied. 

Dennis Griffith, University of Illinois at Urbana-Champaign, Elsa Gunter, University of Illinois at Urbana-Champaign.  2013.  LiquidPi: Inferrable Dependent Session Types. 5th NASA Formal Methods Symposium NFM 2013 .

The Pi Calculus is a popular formalism for modeling distributed computation. Session Types extend the Pi Calculus with a static, inferable type system. Dependent Types allow for a more precise characterization of the behavior of programs, but in their full generality are not inferable. In this paper, we present LiquidPi an approach that combines the dependent type inferencing of Liquid Types with Honda’s Session Types to give a more precise automatically derived description of the behavior of distributed programs. These types can be used to describe/enforce safety properties of distributed systems. We present a type system parametric over an underlying functional language with Pi Calculus connectives and give an inference algorithm for it by means of efficient external solvers and a set of dependent qualifier templates.

Jorge Perez, Luis Caires, Frank Pfenning, Bernardo Toninho.  2014.  Linear Logical Relations and Observational Equivalences for Session-Based Concurrency. Elsevier. 239

We investigate strong normalization, confluence, and behavioral equality in the realm of session-based concurrency. These interrelated issues underpin advanced correctness analysis in models of structured communications. The starting point for our study is an interpretation of linear logic propositions as session types for communicating processes, proposed in prior work. Strong normalization and confluence are established by developing a theory of logical relations. Defined upon a linear type structure, our logical relations remain remarkably similar to those for functional languages. We also introduce a natural notion of observational equivalence for session-typed processes. Strong normalization and confluence come in handy in the associated coinductive reasoning: as applications, we prove that all proof conversions induced by the logic interpretation actually express observational equivalences, and explain how type isomorphismsresulting from linear logic equivalences are realized by coercions between interface types of session-based concurrent systems.

Luis Caires, Frank Pfenning, Bernardo Toninho.  2014.  Linear Logic Propositions as Session Types.

Throughout the years, several typing disciplines for the π-calculus have been proposed. Arguably, the most widespread of these typing disciplines consists of session types. Session types describe the input/output behavior of processes and traditionally provide strong guarantees about this behavior (i.e., deadlock freedom and fidelity). While these systems exploit a fundamental notion of linearity, the precise connection between linear logic and session types has not been well understood. This paper proposes a type system for the π-calculus that corresponds to a standard sequent calculus presentation of intuitionistic linear logic, interpreting linear propositions as session types and thus providing a purely logical account of all key features and properties of session types. We show the deep correspondence between linear logic and session types by exhibiting a tight operational correspondence between cut elimination steps and process reductions. We also discuss an alternative presentation of linear session types based on classical linear logic, and compare our development with other more traditional session type systems. 

Jonathan Aldrich, Cyrus Omar, Alex Potanin, Du Li.  2014.  Language-Based Architectural Control. Proceedings of the International Workshop on Aliasing, Capabilities and Ownership (IWACO), 2014.

Software architects design systems to achieve quality attributes like security, reliability, and performance. Key to achieving these quality attributes are design constraints governing how components of the system are configured, communicate and access resources. Unfortunately, identifying, specifying, communicating and enforcing important design constraints – achieving architectural control – can be difficult, particularly in large software systems. We argue for the development of architectural frameworks, built to leverage language mechanisms that provide for domain-specific syntax, editor services and explicit control over capabilities, that help increase architectural control. In particular, we argue for concise, centralized architectural descriptions which are responsible for specifying constraints and passing a minimal set of capabilities to downstream system components, or explicitly entrusting them to individuals playing defined roles within a team. By integrating these architectural descriptions directly into the language, the type system can help enforce technical constraints and editor services can help enforce social constraints. We sketch our approach in the context of distributed systems. 

Ben Blum.  2012.  Landslide: Systematic Exploration for Kernel-Space Race Detection. School of Computer Science. MS:88.

Systematic exploration is an approach to finding race conditions by deterministically executing every possible interleaving of thread transitions and identifying which ones expose bugs. Current systematic exploration techniques are suitable for testing user-space programs, but are inadequate for testing kernels, where the testing framework’s control over concurrency is more complicated. We present Landslide, a systematic exploration tool for finding races in kernels. Landslide targets Pebbles, the kernel specification that students implement in the undergraduate Operating Systems course at Carnegie Mellon University (15- 410). We discuss the techniques Landslide uses to address the general challenges of kernel-level concurrency, and we evaluate its effectiveness and usability as a debugging aid. We show that our techniques make systematic testing in kernel-space feasible and that Landslide is a useful tool for doing so in the context of 15-410.

Michael Maass, William Scherlis, Jonathan Aldrich.  2014.  In-Nimbo Sandboxing. HotSoS '14 Proceedings of the 2014 Symposium and Bootcamp on the Science of Security.

Sandboxes impose a security policy, isolating applications and their components from the rest of a system. While many sandboxing techniques exist, state of the art sandboxes generally perform their functions within the system that is being defended. As a result, when the sandbox fails or is bypassed, the security of the surrounding system can no longer be assured. We experiment with the idea of in-nimbo sandboxing, encapsulating untrusted computations away from the system we are trying to protect. The idea is to delegate computations that may be vulnerable or malicious to virtual machine instances in a cloud computing environment.

This may not reduce the possibility of an in-situ sandbox compromise, but it could significantly reduce the consequences should that possibility be realized. To achieve this advantage, there are additional requirements, including: (1) A regulated channel between the local and cloud environments that supports interaction with the encapsulated application, (2) Performance design that acceptably minimizes latencies in excess of the in-situ baseline.

To test the feasibility of the idea, we built an in-nimbo sandbox for Adobe Reader, an application that historically has been subject to significant attacks. We undertook a prototype deployment with PDF users in a large aerospace firm. In addition to thwarting several examples of existing PDF-based malware, we found that the added increment of latency, perhaps surprisingly, does not overly impair the user experience with respect to performance or usability.

Javier Camara, Antonia Lopes, David Garlan, Bradley Schmerl.  2014.  Impact Models for Architecture-Based Self-Adaptive Systems.

Self-adaptive systems have the ability to adapt their behavior to dynamic operation conditions. In reaction to changes in the environment, these systems determine the appropriate corrective actions based in part on information about which action will have the best impact on the system. Existing models used to describe the impact of adaptations are either unable to capture the underlying uncertainty and variability of such dynamic environments, or are not compositional and described at a level of abstraction too low to scale in terms of specification effort required for non-trivial systems. In this paper, we address these shortcomings by describing an approach to the specification of impact models based on architectural system descriptions, which at the same time allows us to represent both variability and uncertainty in the outcome of adaptations, hence improving the selection of the best corrective action. The core of our approach is an impact model language equipped with a formal semantics defined in terms of Discrete Time Markov Chains. To validate our approach, we show how employing our language can improve the accuracy of predictions used for decisionmaking in the Rainbow framework for architecture-based self-adaptation. 

Bernardo Toninho, Luis Caires, Frank Pfenning.  2013.  Higher-Order Processes, Functions, and Sessions: A monadic integration. ESOP'13 Proceedings of the 22nd European conference on Programming Languages and Systems. :350-369.

In prior research we have developed a Curry-Howard interpretation of linear sequent calculus as session-typed processes. In this paper we uniformly integrate this computational interpretation in a functional language via a linear contextual monad that isolates session-based concurrency. Monadic values are open process expressions and are first class objects in the language, thus providing a logical foundation for higher-order session typed processes. We illustrate how the combined use of the monad and recursive types allows us to cleanly write a rich variety of concurrent programs, including higher-order programs that communicate processes. We show the standard metatheoretic result of type preservation, as well as a global progress theorem, which to the best of our knowledge, is new in the higher-order session typed setting.

Ghita Mezzour, L. Richard Carley, Kathleen Carley.  2014.  Global Mapping of Cyber Attacks.

Identifying factors behind countries’ weakness to cyber-attacks is an important step towards addressing these weaknesses at the root level.  For example, identifying factors why some countries become cyber- crime safe heavens can inform policy actions about how to reduce the attractiveness of these countries to cyber-criminals.  Currently, however, identifying these factors is mostly based on expert opinions and speculations.

In this work, we perform an empirical study to statistically test the validity of these opinions and specu- lations.  In our analysis, we use Symantec’s World Intelligence Network Environment (WINE) Intrusion Prevention System (IPS) telemetry data which contain attack reports from more than 10 million customer computers worldwide.  We use regression analysis to test for the relevance of multiple factors including monetary and computing resources, cyber-security research and institutions, and corruption.

Our analysis confirms some hypotheses and disproves others. We find that many countries in Eastern Europe extensively host attacking computers because of a combination of good computing infrastructure and high corruption rate.  We also find that web attacks and fake applications are most prevalent in rich countries because attacks on these countries are more lucrative. Finally, we find that computers in Africa launch the lowest rates of cyber-attacks. This is surprising given the bad cyber reputation of some African countries such as Nigeria. Our research has many policy implications.

Radu Vanciu, Marwan Abi-Antoun.  2013.  Extracting Dataflow Objects and other Flow Objects. Foundations of Object-Oriented Languages (FOOL) 2013.

Finding architectural flaws in object-oriented code requires a runtime architecture that shows multiple components of the same type that are used in different contexts. Previous work showed that a runtime architecture can be approximated by an abstract object graph that a static analysis extracts from code with Ownership Domain annotations. To find architectural flaws, it is not enough to reason about the presence or absence of communication. Additional work is needed to reason about the content of the communication. The contribution of this paper is a static analysis that extracts a hierarchical object graph with dataflow edges that refer to objects. The extraction analysis combines the aliasing precision provided by Ownership Domains with a domainsensitive value flow analysis. We evaluate the extraction analysis on an open-source Android application and discuss examples of dataflow edges that refer to objects that are in actual domains or to flow objects that are in domains corresponding to unique annotations.

Jiri Simsa, Randy Bryant, Garth Gibson, Jason Hickey.  2011.  Efficient Exploratory Testing of Concurrent Systems.

In our experience, exploratory testing has reached a level of maturity that makes it a practical and often the most cost-effective approach to testing. Notably, previous work has demonstrated that exploratory testing is capable of finding bugs even in well-tested systems [4, 17, 24, 23]. However, the number of bugs found gives little indication of the efficiency of a testing approach. To drive testing efficiency, this paper focuses on techniques for measuring and maximizing the coverage achieved by exploratory testing. In particular, this paper describes the design, implementation, and evaluation of Eta, a framework for exploratory testing of multithreaded components of a large-scale cluster management system at Google. For simple tests (with millions to billions of possible executions), Eta achieves complete coverage one to two orders of magnitude faster than random testing. For complex tests, Eta adopts a state space reduction technique to avoid the need to explore over 85% of executions and harnesses parallel processing to explore multiple test executions concurrently, achieving a throughput increase of up to 17.5×. 

Luis Caires, Jorge Perez, Frank Pfenning, Bernardo Toninho.  2013.  Behavioral Polymorphism and Parametricity in Session-Based Communication. European Symposium on Programming 2013. 7792:330-349.

We investigate a notion of behavioral genericity in the context of session type disciplines. To this end, we develop a logically motivated theory of parametric polymorphism, reminiscent of the Girard-Reynolds polymorphic λ-calculus, but casted in the setting of concurrent processes. In our theory, polymorphism accounts for the exchange of abstract communication protocols and dynamic instantiation of heterogeneous interfaces, as opposed to the exchange of data types and dynamic instantiation of individual message types. Our polymorphic session-typed process language satisfies strong forms of type preservation and global progress, is strongly normalizing, and enjoys a relational parametricity principle. Combined, our results confer strong correctness guarantees for communicating systems. In particular, parametricity is key to derive non-trivial results about internal protocol independence, a concurrent analogous of representation independence, and non-interference properties of modular, distributed systems.

Marwan Abi-Antoun, Sumukhi Chandrashekar, Radu Vanciu, Andrew Giang.  2014.  Are Object Graphs Extracted Using Abstract Interpretation Significantly Different from the Code? Extended Version SCAM '14 Proceedings of the 2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation.

To evolve object-oriented code, one must understand both the code structure in terms of classes, and the runtime structure in terms of abstractions of objects that are being created and relations between those objects. To help with this understanding, static program analysis can extract heap abstractions such as object graphs. But the extracted graphs can become too large if they do not sufficiently abstract objects, or too imprecise if they abstract objects excessively to the point of being similar to a class diagram that shows one box for a class to represent all the instances of that class. One previously proposed solution uses both annotations and abstract interpretation to extract a global, hierarchical, abstract object graph that conveys both abstraction and design intent, but can still be related to the code structure. In this paper, we define metrics that relate nodes and edges in the object graph to elements in the code structure to measure how they differ, and if the differences are indicative of language or design features such as encapsulation, polymorphism and inheritance. We compute the metrics across eight systems totaling over 100 KLOC, and show a statistically significant difference between the code and the object graph. In several cases, the magnitude of this difference is large.

Arbob Ahmad, Robert Harper.  2015.  An Epistemic Formulation of Information Flow Analysis.

Most  accounts of information flow security in pro- gramming  languages emphasize non-interference  to characterize security: in a secure program,   changes to high-security  inputs do not alter the values  of low-security  outputs. The definition of non-interference   is incompatible  with declassification, which allows some low-security  outputs to be influenced by high-security inputs. We  propose  an alternative  account of information flow based on an epistemic logic of computational effects. Rather  than view a program  as a function from inputs to outputs, we instead embrace the principle that information flow security is concerned with the effects  a program has  on its execution  environment. These effects are modelled using a substructural  epistemic logic that tracks the flow of knowledge  gained by principals  and communication  channels during execution. Confidentiality   is expressed  by proving  necessary conditions  for a principal to know a sensitive fact at the end of an execution. In the simplest case  the necessary  condition   is  falsehood,   which means  that a principal cannot  know a secret  as  a result of a well-typed execution  of a program. In  the presence  of declassification  a necessary condition  for disclosure   is  the existence  of a proof of authorization in a formal authorization  logic, expressing that sensitive data is disclosed only when explicitly  authorized.  Rather than taken  as the primary result, the classical non-interference property  arises in the proof of adequacy of the epistemic theory of disclosure, ensuring that it accurately models program  behavior. It  is  suggested  that an epistemic  account  of information  flow security is both more natural  and more expressive than classical accounts based only on non-interference.

Erik Zawadzki, Geoffrey Gordon, Andre Platzer.  2013.  A projection algorithm for strictly monotone linear complementarity problems. Proceedings of NIPS OPT2013: Optimization for Machine Learning.

Complementary problems play a central role in equilibrium finding, physical sim- ulation, and optimization.  As a consequence, we are interested in understanding how to solve these problems quickly, and this often involves approximation.  In this paper we present a method for approximately solving strictly monotone linear complementarity problems with a Galerkin approximation.  We also give bounds for the approximate error, and prove novel bounds on perturbation error. These perturbation  bounds suggest that a Galerkin approximation  may be much less sen- sitive to noise than the original LCP.

Erik Zawadzki, Andre Platzer, Geoffrey Gordon.  2014.  A Generalization of SAT and #SAT for Robust Policy Evaluation.

Both SAT and #SAT can represent difficult problems in seemingly dissimilar areas such as planning, verification,  and probabilistic  inference. Here, we examine an expressive new language, #∃SAT, that generalizes both of these languages.   #∃SAT problems require counting the number of satisfiable formulas in a concisely-describable  set of existentially quantified, propositional formulas. We characterize the expressiveness and worst-case difficulty of #∃SAT by proving it is complete for the complexity  class #P NP [1], and re- lating this class to more familiar complexity  classes. We also experiment with three new

general-purpose #∃SAT solvers on a battery  of problem distributions  including  a simple logistics domain. Our experiments show that, despite the formidable worst-case complex-

ity of #P NP [1], many of the instances can be solved efficiently  by noticing and exploiting a particular type of frequent structure.

2016-04-25
Hemank Lamba, Thomas J. Glazier, Bradley Schmerl, Javier Camara, David Garlan, Jurgen Pfeffer.  2016.  A Model-based Approach to Anomaly Detection in Software Architectures. Symposium and Bootcamp on the Science of Security (HotSoS).

In an organization, the interactions users have with software leave patterns or traces of the parts of the systems accessed. These interactions can be associated with the underlying software architecture. The first step in detecting problems like insider threat is to detect those traces that are anomalous. Here, we propose a method to find anomalous users leveraging these interaction traces, categorized by user roles. We propose a model based approach to cluster user sequences and find outliers. We show that the approach works on a simulation of a large scale system based on and Amazon Web application style.

Marwan Abi-Antoun, Ebrahim Khalaj, Radu Vanciu, Ahmad Moghimi.  2016.  Abstract Runtime Structure Reasoning about Security. HotSos '16 Proceedings of the Symposium and Bootcamp on the Science of Security.

We propose an interactive approach where analysts reason about the security of a system using an abstraction of its runtime structure, as opposed to looking at the code. They interactively refine a hierarchical object graph, set security properties on abstract objects or edges, query the graph, and investigate the results by studying highlighted objects or edges or tracing to the code. Behind the scenes, an inference analysis and an extraction analysis maintain the soundness of the graph with respect to the code.

Eric Yuan, Sam Malek.  2016.  Mining Software Component Interactions to Detect Security Threats at the Architectural Level. 13th Working IEEE/IFIP Conference on Software Architecture (WICSA 2016).

Conventional security mechanisms at network, host, and source code levels are no longer sufficient in detecting and responding to increasingly dynamic and sophisticated cyber threats today. Detecting anomalous behavior at the architectural level can help better explain the intent of the threat and strengthen overall system security posture. To that end, we present a framework that mines software component interactions from system execution history and applies a detection algorithm to identify anomalous behavior. The framework uses unsupervised learning at runtime, can perform fast anomaly detection “on the fly”, and can quickly adapt to system load fluctuations and user behavior shifts. Our evaluation of the approach against a real Emergency Deployment System has demonstrated very promising results, showing the framework can effectively detect covert attacks, including insider threats, that may be easily missed by traditional intrusion detection methods. 

Bradley Schmerl, Jeffrey Gennari, Javier Camara, David Garlan.  2016.  Raindroid - A System for Run-time Mitigation of Android Intent Vulnerabilities. HotSos '16 Proceedings of the Symposium and Bootcamp on the Science of Security.

Modern frameworks are required to be extendable as well as secure. However, these two qualities are often at odds. In this poster we describe an approach that uses a combination of static analysis and run-time management, based on software architecture models, that can improve security while maintaining framework extendability. We implement a prototype of the approach for the Android platform. Static analysis identifies the architecture and communication patterns among the collection of apps on an Android device and which communications might be vulnerable to attack. Run-time mechanisms monitor these potentially vulnerable communication patterns, and adapt the system to either deny them, request explicit approval from the user, or allow them.