Biblio
It is more expensive and time consuming to build modern software without extensive supply chains. Supply chains decrease these development risks, but typically at the cost of increased security risk. In particular, it is often difficult to understand or verify what a software component delivered by a third party does or could do. Such a component could contain unwanted behaviors, vulnerabilities, or malicious code, many of which become incorporated in applications utilizing the component. Sandboxes provide relief by encapsulating a component and imposing a security policy on it. This limits the operations the component can perform without as much need to trust or verify the component. Instead, a component user must trust or verify the relatively simple sandbox. Given this appealing prospect, researchers have spent the last few decades developing new sandboxing techniques and sandboxes. However, while sandboxes have been adopted in practice, they are not as pervasive as they could be. Why are sandboxes not achieving ubiquity at the same rate as extensive supply chains? This thesis advances our understanding of and overcomes some barriers to sandbox adoption. We systematically analyze ten years (2004 – 2014) of sandboxing research from top-tier security and systems conferences. We uncover two barriers: (1) sandboxes are often validated using relatively subjective techniques and (2) usability for sandbox deployers is often ignored by the studied community. We then focus on the Java sandbox to empirically study its use within the open source community. We find features in the sandbox that benign applications do not use, which have promoted a thriving exploit landscape. We develop run time monitors for the Java Virtual Machine (JVM) to turn off these features, stopping all known sandbox escaping JVM exploits without breaking benign applications. Furthermore, we find that the sandbox contains a high degree of complexity benign applications need that hampers sandbox use. When studying the sandbox’s use, we did not find a single application that successfully deployed the sandbox for security purposes, which motivated us to overcome benignly-used complexity via tooling. We develop and evaluate a series of tools to automate the most complex tasks, which currently require error-prone manual effort. Our tools help users derive, express, and refine a security policy and impose it on targeted Java application JARs and classes. This tooling is evaluated through case studies with industrial collaborators where we sandbox components that were previously difficult to sandbox securely. Finally, we observe that design and implementation complexity causes sandbox developers to accidentally create vulnerable sandboxes. Thus, we develop and evaluate a sandboxing technique that leverages existing cloud computing environments to execute untrusted computations. Malicious outcomes produced by the computations are contained by ephemeral virtual machines. We describe a field trial using this technique with Adobe Reader and compare the new sandbox to existing sandboxes using a qualitative framework we developed.
Recent studies have found that parallel garbage collection performs worse with more CPUs and more collector threads. As part of this work, we further investigate this enomenon and find that poor scalability is worst in highly scalable Java applications. Our investigation to find the causes clearly reveals that efficient multi-threading in an application can prolong the average object lifespan, which results in less effective garbage collection. We also find that prolonging lifespan is the direct result of Linux's Completely Fair Scheduler due to its round-robin like behavior that can increase the heap contention between the application threads. Instead, if we use pseudo first-in-first-out to schedule application threads in large multicore systems, the garbage collection scalability is significantly improved while the time spent in garbage collection is reduced by as much as 21%. The average execution time of the 24 Java applications used in our study is also reduced by 11%. Based on this observation, we propose two approaches to optimally select scheduling policies based on application scalability profile. Our first approach uses the profile information from one execution to tune the subsequent executions. Our second approach dynamically collects profile information and performs policy selection during execution.
Software developers use #ifdef statements to support code configurability, allowing software product diversification. But because functions can be in many executions paths that depend on complex combinations of configuration options, the introduction of an #ifdef for a given purpose (such as adding a new feature to a program) can enable unintended function calls, which can be a source of vulnerabilities. Part of the difficulty lies in maintaining mental models of all dependencies. We propose analytic visualizations of thevariational callgraph to capture dependencies across configurations and create visualizations to demonstrate how it would help developers visually reason through the implications of diversification, for example through visually doing change impact analysis.
Today's social-coding tools foreshadow a transformation of the software industry, as it relies increasingly on open libraries, frameworks, and code fragments. Our vision calls for new intelligently transparent services that support rapid development of innovative products while helping developers manage risk and issuing them early warnings of looming failures. Intelligent transparency is enabled by an infrastructure that applies analytics to data from all phases of the life cycle of open source projects, from development to deployment. Such an infrastructure brings stakeholders the information they need when they need it.
Sociotechnical systems (STSs), where users interact with software components, support automated logging, i.e., what a user has performed in the system. However, most systems do not implement automated processes for inspecting the logs when a misuse happens. Deciding what needs to be logged is crucial as excessive amounts of logs might be overwhelming for human analysts to inspect. The goal of this research is to aid software practitioners to implement automated forensic logging by providing a systematic method of using attackers' malicious intentions to decide what needs to be logged. We propose Lokma: a normative framework to construct logging rules for forensic knowledge. We describe the general forensic process of Lokma, and discuss related directions.
We understand a socio-technical system (STS) as a cyber-physical system in which two or more autonomous parties interact via or about technical elements, including the parties’ resources and actions. As information technology begins to pervade every corner of human life, STSs are becoming ever more common, and the challenge of governing STSs is becoming increasingly important. We advocate a normative basis for governance, wherein norms represent the standards of correct behaviour that each party in an STS expects from others. A major benefit of focussing on norms is that they provide a socially realistic view of interaction among autonomous parties that abstracts low-level implementation details. Overlaid on norms is the notion of a sanction as a negative or positive reaction to potentially any violation of or compliance with an expectation. Although norms have been well studied as regards governance for STSs, sanctions have not. Our understanding and usage of norms is inadequate for the purposes of governance unless we incorporate a comprehensive representation of sanctions.
Norms provide a way to model the social architecture of a sociotechnical system (STS) and are thus crucial for understanding how such a system supports secure collaboration between principals,that is, autonomous parties such as humans and organizations. Accordingly, an important challenge is to compute the state of a norm instance at runtime in a sociotechnical system.
Custard addresses this challenge by providing a relational syntax for schemas of important norm types along with their canonical lifecycles and providing a mapping from each schema to queries that compute instances of the schema in different lifecycle stages. In essence, Custard supports a norm-based abstraction layer over underlying information stores such as databases and event logs. Specifically, it supports deadlines; complex events, including those based on aggregation; and norms that reference other norms.
We prove important correctness properties for Custard, including stability (once an event has occurred, it has occurred forever) and safety (a query returns a finite set of tuples). Our compiler generates SQL queries from Custard specifications. Writing out such SQL queries by hand is tedious and error-prone even for simple norms, thus demonstrating Custard's practical benefits.
Secure collaboration requires the collaborating parties to apply the
right policies for their interaction. We adopt a notion of
conditional, directed norms as a way to capture the standards of
correctness for a collaboration. How can we handle conflicting norms?
We describe an approach based on knowledge of what norm dominates what
norm in what situation. Our approach adapts answer-set programming to
compute stable sets of norms with respect to their computed conflicts
and dominance. It assesses agent compliance with respect to those
stable sets. We demonstrate our approach on a healthcare scenario.
The C preprocessor has received strong criticism in academia, among others regarding separation of concerns, error proneness, and code obfuscation, but is widely used in practice. Many (mostly academic) alternatives to the preprocessor exist, but have not been adopted in practice. Since developers continue to use the preprocessor despite all criticism and research, we ask how practitioners perceive the C preprocessor. We performed interviews with 40 developers, used grounded theory to analyze the data, and cross-validated the results with data from a survey among 202 developers, repository mining, and results from previous studies. In particular, we investigated four research questions related to why the preprocessor is still widely used in practice, common problems, alternatives, and the impact of undisciplined annotations. Our study shows that developers are aware of the criticism the C preprocessor receives, but use it nonetheless, mainly for portability and variability. Many developers indicate that they regularly face preprocessor-related problems and preprocessor-related bugs. The majority of our interviewees do not see any current C-native technologies that can entirely replace the C preprocessor. However, developers tend to mitigate problems with guidelines, but those guidelines are not enforced consistently. We report the key insights gained from our study and discuss implications for practitioners and researchers on how to better use the C preprocessor to minimize its negative impact.
Highly configurable systems allow users to tailor software to specific needs. Valid combinations of configuration options are often restricted by intricate constraints. Describing options and constraints in a variability model allows reasoning about the supported configurations. To automate creating and verifying such models, we need to identify the origin of such constraints. We propose a static analysis approach, based on two rules, to extract configuration constraints from code. We apply it on four highly configurable systems to evaluate the accuracy of our approach and to determine which constraints are recoverable from the code. We find that our approach is highly accurate (93% and 77% respectively) and that we can recover 28% of existing constraints. We complement our approach with a qualitative study to identify constraint sources, triangulating results from our automatic extraction, manual inspections, and interviews with 27 developers. We find that, apart from low-level implementation dependencies, configuration constraints enforce correct runtime behavior, improve users' configuration experience, and prevent corner cases. While the majority of constraints is extractable from code, our results indicate that creating a complete model requires further substantial domain knowledge and testing. Our results aim at supporting researchers and practitioners working on variability model engineering, evolution, and verification techniques.
Build systems contain a lot of configuration knowledge about a software system, such as under which conditions specific files are compiled. Extracting such configuration knowledge is important for many tools analyzing highly-configurable systems, but very challenging due to the complex nature of build systems. We design an approach, based on SYMake, that symbolically evaluates Makefiles and extracts configuration knowledge in terms of file presence conditions and conditional parameters. We implement an initial prototype and demonstrate feasibility on small examples.
Security has consistently been the focus of attention in many highly-configurable software systems. Several vulnerabilities on widely-used systems, such as the Linux kernel and OpenSSL, are reported every day in the National Vulnerability Database (NVD). The configurability of these systems enables the rapid generation of customized products, but also creates security challenges in the development and maintenance processes. For instance, interactions caused by configurations may create serious security threats and make generated products more susceptible to attacks [6], but the causes of these problems may be harder to detect because they occur only in specific configurations.
The number of trojans, worms, and viruses that computers encounter varies greatly across countries. Empirically identifying factors behind such variation can provide a scientific empirical basis to policy actions to reduce malware encounters in the most affected countries. However, our understanding of these factors is currently mainly based on expert opinions, not empirical evidence.
In this paper, we empirically test alternative hypotheses about factors behind international variation in the number of trojan, worm, and virus encounters. We use the Symantec Anti-Virus (AV) telemetry data collected from more than 10 million Symantec customer computers worldwide that we accessed through the Symantec Worldwide Intelligence Environment (WINE) platform. We use regression analysis to test for the effect of computing and monetary resources, web browsing behavior, computer piracy, cyber security expertise, and international relations on international variation in malware encounters.
We find that trojans, worms, and viruses are most prevalent in Sub-Saharan African countries. Many Asian countries also encounter substantial quantities of malware. Our regression analysis reveals that the main factor that explains high malware exposure of these countries is a widespread computer piracy especially when combined with poverty. Our regression analysis also reveals that, surprisingly, web browsing behavior, cyber security expertise, and international relations have no significant effect.
In today’s inter-connected world, threats from anywhere in the world can have serious global repercussions. In particular, two types of threats have a global impact: 1) cyber crime and 2) cyber and biological weapons. If a country’s environment is conducive to cyber criminal activities, cyber criminals will use that country as a basis to attack end-users around the world. Cyber weapons and biological weapons can now allow a small actor to inflict major damage on a major military power. If cyber and biological weapons are used in combination, the damage can be amplified significantly. Given that the cyber and biological threat is global, it is important to identify countries that pose the greatest threat and design action plans to reduce the threat from these countries. However, prior work on cyber crime lacks empirical substantiation for reasons why some countries’ environments are conducive to cyber crime. Prior work on cyber and biological weapon capabilities mainly consists of case studies which only focus on select countries and thus are not generalizeable. To sum up, assessing the global cyber and biological threat currently lacks a systematic empirical approach. In this thesis, I take an empirical and systematic approach towards assessing the global cyber and biological threat. The first part of the thesis focuses on cyber crime. I examine international variation in cyber crime infrastructure hosting and cyber crime exposure. I also empirically test hypotheses about factors behind such variation. In that work, I use Symantec’s telemetry data, collected from 10 million Symantec customer computers worldwide and accessed through the Symantec’s Worldwide Intelligence Network Environment (WINE). I find that addressing corruption in Eastern Europe or computer piracy in Sub-Saharan Africa has the potential to reduce the global cyber crime. The second part of the thesis focuses on cyber and biological weapon capabilities. I develop two computational methodologies: one to assess countries’ biological capabilities and one to assess countries’ cyber capabilities. The methodologies examine all countries in the world and can be used by non-experts that only have access to publicly available data. I validate the biological weapon assessment methodology by comparing the methodology’s assessment to historical data. This work has the potential to proactively reduce the global cyber and biological weapon threat.
Safety analysis is recognized as a fundamental problem in access control. It has been studied for various access control schemes in the literature. Recent work has proposed an administrative model for Temporal Role-Based Access Control (TRBAC) policies called Administrative TRBAC (ATRBAC). We address ATRBAC-safety. We first identify that the problem is PSPACE-Complete. This is a much tighter identification of the computational complexity of the problem than prior work, which shows only that the problem is decidable. With this result as the basis, we propose an approach that leverages an existing open-source software tool called Mohawk to address ATRBAC-safety. Our approach is to efficiently reduce ATRBAC-safety to ARBAC-safety, and then use Mohawk. We have conducted a thorough empirical assessment. In the course of our assessment, we came up with a "reduction toolkit," which allows us to reduce Mohawk+T input instances to instances that existing tools support. Our results suggest that there are some input classes for which Mohawk+T outperforms existing tools, and others for which existing tools outperform Mohawk+T. The source code for Mohawk+T is available for public download.
Security requirements analysis depends on how well-trained analysts perceive security risk, understand the impact of various vulnerabilities, and mitigate threats. When systems are composed of multiple machines, configurations, and software components that interact with each other, risk perception must account for the composition of security requirements. In this paper, we report on how changes to security requirements affect analysts risk perceptions and their decisions about how to modify the requirements to reach adequate security levels. We conducted two user surveys of 174 participants wherein participants assess security levels across 64 factorial vignettes. We analyzed the survey results using multi-level modeling to test for the effect of security requirements composition on participants’ overall security adequacy ratings and on their ratings of individual requirements. We accompanied this analysis with grounded analysis of elicited requirements aimed at lowering the security risk. Our results suggest that requirements composition affects experts’ adequacy ratings on security requirements. In addition, we identified three categories of requirements modifications, called refinements, replacements and reinforcements, and we measured how these categories compare with overall perceived security risk. Finally, we discuss the future impact of our work in security requirements assessment practice.
Mobile and web applications increasingly leverage service-oriented architectures in which developers integrate third-party services into end user applications. This includes identity management, mapping and navigation, cloud storage, and advertising services, among others. While service reuse reduces development time, it introduces new privacy and security risks due to data repurposing and over-collection as data is shared among multiple parties who lack transparency into third-party data practices. To address this challenge, we propose new techniques based on Description Logic (DL) for modeling multi-party data flow requirements and verifying the purpose specification and collection and use limitation principles, which are prominent privacy properties found in international standards and guidelines. We evaluate our techniques in an empirical case study that examines the data practices of the Waze mobile application and three of their service providers: Facebook Login, Amazon Web Services (a cloud storage provider), and Flurry.com (a popular mobile analytics and advertising platform). The study results include detected conflicts and violations of the principles as well as two patterns for balancing privacy and data use flexibility in requirements specifications. Analysis of automation reasoning over the DL models show that reasoning over complex compositions of multi-party systems is feasible within exponential asymptotic timeframes proportional to the policy size, the number of expressed data, and orthogonal to the number of conflicts found.
A self-managing software system should be able to monitor and analyze its runtime behavior and make adaptation decisions accordingly to meet certain desirable objectives. Traditional software adaptation techniques and recent “models@runtime” approaches usually require an a priori model for a system’s dynamic behavior. Oftentimes the model is difficult to define and labor-intensive to maintain, and tends to get out of date due to adaptation and architecture decay. We propose an alternative approach that does not require defining the system’s behavior model beforehand, but instead involves mining software component interactions from system execution traces to build a probabilistic usage model, which is in turn used to analyze, plan, and execute adaptations. In this article, we demonstrate how such an approach can be realized and effectively used to address a variety of adaptation concerns. In particular, we describe the details of one application of this approach for safely applying dynamic changes to a running software system without creating inconsistencies. We also provide an overview of two other applications of the approach, identifying potentially malicious (abnormal) behavior for self-protection, and improving deployment of software components in a distributed setting for performance self-optimization. Finally, we report on our experiments with engineering self-management features in an emergency deployment system using the proposed mining approach.
Sandboxes are increasingly important building materials for secure software systems. In recognition of their potential to improve the security posture of many systems at various points in the development lifecycle, researchers have spent the last several decades developing, improving, and evaluating sandboxing techniques. What has been done in this space? Where are the barriers to advancement? What are the gaps in these efforts? We systematically analyze a decade of sandbox research from five top-tier security and systems conferences using qualitative content analysis, statistical clustering, and graph-based metrics to answer these questions and more. We find that the term “sandbox” currently has no widely accepted or acceptable definition. We use our broad scope to propose the first concise and comprehensive definition for “sandbox” that consistently encompasses research sandboxes. We learn that the sandboxing landscape covers a range of deployment options and policy enforcement techniques collectively capable of defending diverse sets of components while mitigating a wide range of vulnerabilities. Researchers consistently make security, performance, and applicability claims about their sandboxes and tend to narrowly define the claims to ensure they can be evaluated. Those claims are validated using multi-faceted strategies spanning proof, analytical analysis, benchmark suites, case studies, and argumentation. However, we find two cases for improvement: (1) the arguments researchers present are often ad hoc and (2) sandbox usability is mostly uncharted territory. We propose ways to structure arguments to ensure they fully support their corresponding claims and suggest lightweight means of evaluating sandbox usability.
Designing secure cyber-physical systems (CPS) is a particularly difficult task since security vulnerabilities stem not only from traditional cybersecurity concerns, but also physical ones. Many of the standard methods for CPS design make strong and unverified assumptions about the trustworthiness of physical devices, such as sensors. When these assumptions are violated, subtle inter-domain vulnerabilities are introduced into the system model. In this paper we use formal specification of analysis contracts to expose security assumptions and guarantees of analyses from reliability, control, and sensor security domains. We show that this specification allows us to determine where these assumptions are violated, opening the door to malicious attacks. We demonstrate how this approach can help discover and prevent vulnerabilities using a self-driving car example.
Android is the most popular platform for mobile devices. It facilitates sharing of data and services among applications using a rich inter-app communication system. While access to resources can be controlled by the Android permission system, enforcing permissions is not sufficient to prevent security violations, as permissions may be mismanaged, intentionally or unintentionally. Android's enforcement of the permissions is at the level of individual apps, allowing multiple malicious apps to collude and combine their permissions or to trick vulnerable apps to perform actions on their behalf that are beyond their individual privileges. In this paper, we present COVERT, a tool for compositional analysis of Android inter-app vulnerabilities. COVERT's analysis is modular to enable incremental analysis of applications as they are installed, updated, and removed. It statically analyzes the reverse engineered source code of each individual app, and extracts relevant security specifications in a format suitable for formal verification. Given a collection of specifications extracted in this way, a formal analysis engine (e.g., model checker) is then used to verify whether it is safe for a combination of applications-holding certain permissions and potentially interacting with each other-to be installed together. Our experience with using COVERT to examine over 500 real-world apps corroborates its ability to find inter-app vulnerabilities in bundles of some of the most popular apps on the market.
Pervasiveness of smartphones and the vast number of corresponding apps have underlined the need for applicable automated software testing techniques. A wealth of research has been focused on either unit or GUI testing of smartphone apps, but little on automated support for end-to-end system testing. This paper presents SIG-Droid, a framework for system testing of Android apps, backed with automated program analysis to extract app models and symbolic execution of source code guided by such models for obtaining test inputs that ensure covering each reachable branch in the program. SIG-Droid leverages two automatically extracted models: Interface Model and Behavior Model. The Interface Model is used to find values that an app can receive through its interfaces. Those values are then exchanged with symbolic values to deal with constraints with the help of a symbolic execution engine. The Behavior Model is used to drive the apps for symbolic execution and generate sequences of events. We provide an efficient implementation of SIG-Droid based in part on Symbolic PathFinder, extended in this work to support automatic testing of Android apps. Our experiments show SIG-Droid is able to achieve significantly higher code coverage than existing automated testing tools targeted for Android.