Biblio
Advanced persistent threats (APTs) are a particularly troubling challenge for software systems. The adversarial nature of the security domain, and APTs in particular, poses unresolved challenges to the design of self-* systems, such as how to defend against multiple types of attackers with different goals and capabilities. In this interaction, the observability of each side is an important and under-investigated issue in the self-* domain. We propose a model of APT defense that elevates observability as a first-class concern. We evaluate this model by showing how an informed approach that uses observability improves the defender's utility compared to a uniform random strategy, can enable robust planning through sensitivity analysis, and can inform observability-related architectural design decisions.
Large software systems have to contend with a significant number of users who interact with different components of the system in various ways. The sequences of components that are used as part of an interaction define sets of behaviors that users have with the system. These can be large in number. Among these users, it is possible that there are some who exhibit anomalous behaviors -- for example, they may have found back doors into the system and are doing something malicious. These anomalous behaviors can be hard to distinguish from normal behavior because of the number of interactions a system may have, or because traces may deviate only slightly from normal behavior. In this paper we describe a model-based approach to cluster sequences of user behaviors within a system and to find suspicious, or anomalous, sequences. We exploit the underlying software architecture of a system to define these sequences. We further show that our approach is better at detecting suspicious activities than other approaches, specifically those that use unigrams and bigrams for anomaly detection. We show this on a simulation of a large scale system based on Amazon Web application style architecture.
Security researchers do not have sufficient example systems for conducting research on advanced persistent threats, and companies and agencies that experience attacks in the wild are reluctant to release detailed information that can be examined. In this paper, we describe an Advanced Persistent Threat Exemplar that is intended to provide a real-world attack scenario with sufficient complexity for reasoning about defensive system adaptation, while not containing so much information as to be too complex. It draws from actual published attacks and experiences as a security engineer by the authors.
Software systems are increasingly called upon to autonomously manage their goals in changing contexts and environments, and under evolving requirements. In some circumstances, autonomous systems cannot be fully-automated but instead cooperate with human operators to maintain and adapt themselves. Furthermore, there are times when a choice should be made between doing a manual or automated repair. Involving operators in self-adaptation should itself be adaptive, and consider aspects such as the training, attention, and ability of operators. Not only do these aspects change from person to person, but they may change with the same person. These aspects make the choice of whether to involve humans non-obvious. Self-adaptive systems should trade-off whether to involve operators, taking these aspects into consideration along with other business qualities it is attempting to achieve. In this chapter, we identify the various roles that operators can perform in cooperating with self-adapting systems. We focus on humans as effectors-doing tasks which are difficult or infeasible to automate. We describe how we modified our self-adaptive framework, Rainbow, to involve operators in this way, which involved choosing suitable human models and integrating them into the existing utility trade-off decision models of Rainbow. We use probabilistic modeling and quantitative verification to analyze the trade-offs of involving humans in adaptation, and complement our study with experiments to show how different business preferences and modalities of human involvement may result in different outcomes.
Software architecture modeling is important for analyzing system quality attributes, particularly security. However, such analyses often assume that the architecture is completely known in advance. In many modern domains, especially those that use plugin-based frameworks, it is not possible to have such a complete model because the software system continuously changes. The Android mobile operating system is one such framework, where users can install and uninstall apps at run time. We need ways to model and analyze such architectures that strike a balance between supporting the dynamism of the underlying platforms and enabling analysis, particularly throughout a system’s lifetime. In this paper, we describe a formal architecture style that captures the modifiable architectures of Android systems, and that supports security analysis as a system evolves. We illustrate the use of the style with two security analyses: a predicatebased approach defined over architectural structure that can detect some common security vulnerabilities, and inter-app permission leakage determined by model checking. We also show how the evolving architecture of an Android device can be obtained by analysis of the apps on a device, and provide some performance evaluation that indicates that the architecture can be amenable for use throughout the system’s lifetime.
Modern frameworks are required to be extendable as well as secure. However, these two qualities are often at odds. In this poster we describe an approach that uses a combination of static analysis and run-time management, based on software architecture models, that can improve security while maintaining framework extendability. We implement a prototype of the approach for the Android platform. Static analysis identifies the architecture and communication patterns among the collection of apps on an Android device and which communications might be vulnerable to attack. Run-time mechanisms monitor these potentially vulnerable communication patterns, and adapt the system to either deny them, request explicit approval from the user, or allow them.
In an organization, the interactions users have with software leave patterns or traces of the parts of the systems accessed. These interactions can be associated with the underlying software architecture. The first step in detecting problems like insider threat is to detect those traces that are anomalous. Here, we propose a method to find anomalous users leveraging these interaction traces, categorized by user roles. We propose a model based approach to cluster user sequences and find outliers. We show that the approach works on a simulation of a large scale system based on and Amazon Web application style.
Modern frameworks are required to be extendable as well as secure. However, these two qualities are often at odds. In this poster we describe an approach that uses a combination of static analysis and run-time management, based on software architecture models, that can improve security while maintaining framework extendability.
Insider threats are a well-known problem, and previous studies have shown that it has a huge impact over a wide range of sectors like financial services, governments, critical infrastructure services and the telecommunications sector. Users, while interacting with any software system, leave a trace of what nodes they accessed and in what sequence. We propose to translate these sequences of observed activities into paths on the graph of the underlying software architectural model. We propose a clustering algorithm to find anomalies in the data, which can be combined with contextual information to confirm as an insider threat.
Self-adaptive systems overcome many of the limitations of human supervision in complex software-intensive systems by endowing them with the ability to automatically adapt their structure and behavior in the presence of runtime changes. However, adaptation in some classes of systems (e.g., safety-critical) can benefit by receiving information from humans (e.g., acting as sophisticated sensors, decision-makers), or by involving them as system-level effectors to execute adaptations (e.g., when automation is not possible, or as a fallback mechanism). However, human participants are influenced by factors external to the system (e.g., training level, fatigue) that affect the likelihood of success when they perform a task, its duration, or even if they are willing to perform it in the first place. Without careful consideration of these factors, it is unclear how to decide when to involve humans in adaptation, and in which way. In this paper, we investigate how the explicit modeling of human participants can provide a better insight into the trade-offs of involving humans in adaptation. We contribute a formal framework to reason about human involvement in self-adaptation, focusing on the role of human participants as actors (i.e., effectors) during the execution stage of adaptation. The approach consists of: (i) a language to express adaptation models that capture factors affecting human behavior and its interactions with the system, and (ii) a formalization of these adaptation models as stochastic multiplayer games (SMGs) that can be used to analyze human-system-environment interactions. We illustrate our approach in an adaptive industrial middleware used to monitor and manage sensor networks in renewable energy production plants.
Self-adaptive systems tend to be reactive and myopic, adapting in response to changes without anticipating what the subsequent adaptation needs will be. Adapting reactively can result in inefficiencies due to the system performing a suboptimal sequence of adaptations. Furthermore, when adaptations have latency, and take some time to produce their effect, they have to be started with sufficient lead time so that they complete by the time their effect is needed. Proactive latency-aware adaptation addresses these issues by making adaptation decisions with a look-ahead horizon and taking adaptation latency into account. In this paper we present an approach for proactive latency-aware adaptation under uncertainty that uses probabilistic model checking for adaptation decisions. The key idea is to use a formal model of the adaptive system in which the adaptation decision is left underspecified through nondeterminism, and have the model checker resolve the nondeterministic choices so that the accumulated utility over the horizon is maximized. The adaptation decision is optimal over the horizon, and takes into account the inherent uncertainty of the environment predictions needed for looking ahead. Our results show that the decision based on a look-ahead horizon, and the factoring of both tactic latency and environment uncertainty, considerably improve the effectiveness of adaptation decisions.
Modern software systems are often compositions of entities that increasingly use self-adaptive capabilities to improve their behavior to achieve systemic quality goals. Self adaptive managers for each component system attempt to provide locally optimal results, but if they cooperated and potentially coordinated their efforts it might be possible to obtain more globally optimal results. The emergent properties that result from such composition and cooperation of self-adaptive systems are not well understood, difficult to reason about, and present a key challenge in the evolution of modern software systems. For example, the effects of coordination patterns and protocols on emergent properties, such as the resiliency of the collectives, need to be understood when designing these systems. In this paper we propose that probabilistic model checking of stochastic multiplayer games (SMG) provides a promising approach to analyze, understand, and reason about emergent properties in collectives of adaptive systems (CAS). Probabilistic Model Checking of SMGs is a technique particularly suited to analyzing emergent properties in CAS since SMG models capture: (i) the uncertainty and variability intrinsic to a CAS and its execution environment in the form of probabilistic and nondeterministic choices, and (ii) the competitive/cooperative aspects of the interplay among the constituent systems of the CAS. Analysis of SMGs allows us to reason about things like the worst case scenarios, which constitutes a new contribution to understanding emergent properties in CAS. We investigate the use of SMGs to show how they can be useful in analyzing the impact of communication topology for collections of fully cooperative systems defending against an external attack.
Designing secure cyber-physical systems (CPS) is a particularly difficult task since security vulnerabilities stem not only from traditional cybersecurity concerns, but also physical ones. Many of the standard methods for CPS design make strong and unverified assumptions about the trustworthiness of physical devices, such as sensors. When these assumptions are violated, subtle inter-domain vulnerabilities are introduced into the system model. In this paper we use formal specification of analysis contracts to expose security assumptions and guarantees of analyses from reliability, control, and sensor security domains. We show that this specification allows us to determine where these assumptions are violated, opening the door to malicious attacks. We demonstrate how this approach can help discover and prevent vulnerabilities using a self-driving car example.
Designing secure cyber-physical systems (CPS) is a particularly difficult task since security vulnerabilities stem not only from traditional cybersecurity concerns, but also physical ones. Many of the standard methods for CPS design make strong and unverified assumptions about the trustworthiness of physical devices, such as sensors. When these assumptions are violated, subtle inter-domain vulnerabilities are introduced into the system model. In this paper we use formal specification of analysis contracts to expose security assumptions and guarantees of analyses from reliability, control, and sensor security domains. We show that this specification allows us to determine where these assumptions are violated, opening the door to malicious attacks. We demonstrate how this approach can help discover and prevent vulnerabilities using a self-driving car example.
Self-adaptive systems have the ability to adapt their behavior to dynamic operation conditions. In reaction to changes in the environment, these systems determine the appropriate corrective actions based in part on information about which action will have the best impact on the system. Existing models used to describe the impact of adaptations are either unable to capture the underlying uncertainty and variability of such dynamic environments, or are not compositional and described at a level of abstraction too low to scale in terms of specification effort required for non-trivial systems. In this paper, we address these shortcomings by describing an approach to the specification of impact models based on architectural system descriptions, which at the same time allows us to represent both variability and uncertainty in the outcome of adaptations, hence improving the selection of the best corrective action. The core of our approach is an impact model language equipped with a formal semantics defined in terms of Discrete Time Markov Chains. To validate our approach, we show how employing our language can improve the accuracy of predictions used for decisionmaking in the Rainbow framework for architecture-based self-adaptation.
Security features are often hardwired into software applications, making it difficult to adapt security responses to reflect changes in runtime context and new attacks. In prior work, we proposed the idea of architecture-based self-protection as a way of separating adaptation logic from application logic and providing a global perspective for reasoning about security adaptations in the context of other business goals. In this paper, we present an approach, based on this idea, for combating denial-of-service (DoS) attacks. Our approach allows DoS-related tactics to be composed into more sophisticated mitigation strategies that encapsulate possible responses to a security problem. Then, utility-based reasoning can be used to consider different business contexts and qualities. We describe how this approach forms the underpinnings of a scientific approach to self-protection, allowing us to reason about how to make the best choice of mitigation at runtime. Moreover, we also show how formal analysis can be used to determine whether the mitigations cover the range of conditions the system is likely to encounter, and the effect of mitigations on other quality attributes of the system. We evaluate the approach using the Rainbow self-adaptive framework and show how Rainbow chooses DoS mitigation tactics that are sensitive to different business contexts.
In many scientific fields, simulations and analyses require compositions of computational entities such as web-services, programs, and applications. In such fields, users may want various trade-offs between different qualities. Examples include: (i) performing a quick approximation vs. an accurate, but slower, experiment, (ii) using local slower execution environments vs. remote, but advanced, computing facilities, (iii) using quicker approximation algorithms vs. computationally expensive algorithms with smaller data. However, such trade-offs are difficult to make as many such decisions today are either (a) wired into a fixed configuration and cannot be changed, or (b) require detailed systems knowledge and experimentation to determine what configuration to use. In this paper we propose an approach that uses architectural models coupled with automated design space generation for making fidelity and timeliness trade-offs. We illustrate this approach through an example in the intelligence analysis domain.
As new market opportunities, technologies, platforms, and frameworks become available, systems require large-scale and systematic architectural restructuring to accommodate them. Today's architects have few techniques to help them plan this architecture evolution. In particular, they have little assistance in planning alternative evolution paths, trading off various aspects of the different paths, or knowing best practices for particular domains. In this paper, we describe an approach for planning and reasoning about architecture evolution. Our approach focuses on providing architects with the means to model prospective evolution paths and supporting analysis to select among these candidate paths. To demonstrate the usefulness of our approach, we show how it can be applied to an actual architecture evolution. In addition, we present some theoretical results about our evolution path constraint specification language.
Availability is an increasingly important quality for today's software-based systems and it has been successfully addressed by the use of closed-loop control systems in self-adaptive systems. Probes are inserted into a running system to obtain information and the information is fed to a controller that, through provided interfaces, acts on the system to alter its behavior. When a failure is detected, pinpointing the source of the failure is a critical step for a repair action. However, information obtained from a running system is commonly incomplete due to probing costs or unavailability of probes. In this paper we address the problem of fault localization in the presence of incomplete system monitoring. We may not be able to directly observe a component but we may be able to infer its health state. We provide formal criteria to determine when health states of unobservable components can be inferred and establish formal theoretical bounds for accuracy when using any spectrum-based fault localization algorithm.
Self-diagnosis is a fundamental capability of self-adaptive systems. In order to recover from faults, systems need to know which part is responsible for the incorrect behavior. In previous work we showed how to apply a design-time diagnosis technique at run time to identify faults at the architectural level of a system. Our contributions address three major shortcomings of our previous work: 1) we present an expressive, hierarchical language to describe system behavior that can be used to diagnose when a system is behaving different to expectation; the hierarchical language facilitates mapping low level system events to architecture level events; 2) we provide an automatic way to determine how much data to collect before an accurate diagnosis can be produced; and 3) we develop a technique that allows the detection of correlated faults between components. Our results are validated experimentally by injecting several failures in a system and accurately diagnosing them using our algorithm.
Since conventional software security approaches are often manually developed and statically deployed, they are no longer sufficient against today's sophisticated and evolving cyber security threats. This has motivated the development of self-protecting software that is capable of detecting security threats and mitigating them through runtime adaptation techniques. In this paper, we argue for an architecture-based self- protection (ABSP) approach to address this challenge. In ABSP, detection and mitigation of security threats are informed by an architectural representation of the running system, maintained at runtime. With this approach, it is possible to reason about the impact of a potential security breach on the system, assess the overall security posture of the system, and achieve defense in depth. To illustrate the effectiveness of this approach, we present several architecture adaptation patterns that provide reusable detection and mitigation strategies against well-known web application security threats. Finally, we describe our ongoing work in realizing these patterns on top of Rainbow, an existing architecture-based adaptation framework.
In this report we show how to adapt the notion of “attack surface” to formally evaluate security properties at the architectural level of design and to identify vulnerabilities in architectural designs. Further we explore the application of this metric in the context of architecture-based transformations to improve security by reducing the attack surface. These transformations are described in detail and validated with a simple experiment.
An important step in achieving robustness to run-time faults is the ability to detect and repair problems when they arise in a running system. Effective fault detection and repair could be greatly enhanced by run-time fault diagnosis and localization, since it would allow the repair mechanisms to focus adaptation effort on the parts most in need of attention. In this paper we describe an approach to run-time fault diagnosis that combines architectural models with spectrum-based reasoning for multiple fault localization. Spectrum-based reasoning is a lightweight technique that takes a form of trace abstraction and produces a list (ordered by probability) of likely fault candidates. We show how this technique can be combined with architectural models to support run-time diagnosis that can (a) scale to modern distributed software systems; (b) accommodate the use of black-box components and proprietary infrastructure for which one has neither a specification nor source code; and (c) handle inherent uncertainty about the probable cause of a problem even in the face of transient faults and faults that arise only when certain combinations of system components interact.