Bibliography
Engineering complex distributed systems is challenging. Recent solutions for the development of cyber-physical systems (CPS) in industry tend to rely on architectural designs based on service orientation, where the constituent components are deployed according to their service behavior and are to be understood as loosely coupled and mostly independent. In this paper, we develop a workflow that combines contract-based and CPS model-based specifications with service orientation, and analyze the resulting model using fault injection to assess the dependability of the systems. Compositionality principles based on the contract specification help us to make the analysis practical. The presented techniques are evaluated on two case studies.
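As a rough illustration of the contract-plus-fault-injection idea in this abstract, the sketch below wraps a component with an (assume, guarantee) contract and injects a fault into its output to see whether the guarantee still holds. The component, predicates, and fault are invented for illustration only; they are not the paper's workflow or tooling.

    # Minimal sketch of contract-based fault injection (illustrative assumptions only).
    from dataclasses import dataclass
    from typing import Callable, Optional


    @dataclass
    class Contract:
        assume: Callable[[float], bool]      # what the component expects of its input
        guarantee: Callable[[float], bool]   # what it promises about its output


    def check(component: Callable[[float], float], contract: Contract,
              x: float, fault: Optional[Callable[[float], float]] = None) -> bool:
        """Run the component (optionally with an injected output fault) and report
        whether its guarantee still holds for an input satisfying the assumption."""
        if not contract.assume(x):
            return True   # vacuously satisfied outside the assumption
        y = component(x)
        if fault is not None:
            y = fault(y)  # fault injection: corrupt the produced value
        return contract.guarantee(y)


    # Hypothetical speed controller: bounded input, bounded output.
    controller = lambda speed: min(speed * 0.5, 30.0)
    contract = Contract(assume=lambda s: 0.0 <= s <= 100.0,
                        guarantee=lambda out: 0.0 <= out <= 30.0)

    print(check(controller, contract, 80.0))                             # True: nominal run
    print(check(controller, contract, 80.0, fault=lambda y: y + 100.0))  # False: guarantee violated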
The server is an important component for storing data collected during the diagnostics of a Smart Business Center (SBC), a subsystem of the Industrial Internet of Things that includes sensors, network equipment, and components for starting and storing monitoring programs and technical diagnostics. The server is most often exposed to various kinds of attacks, in particular those aimed at the processor, the interface system, and random access memory. The goal of the paper is to analyze methods of protecting the SBC server from malicious actions, as well as to develop and investigate a Markov model of the server's functioning in the SBC network, taking into account the impact of DDoS attacks.
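To make the kind of Markov availability analysis described above concrete, the following sketch solves a three-state continuous-time Markov chain for a server that is operational, degraded under DDoS attack, or failed. The states and transition rates are assumptions for illustration only, not the model developed in the paper.

    # Illustrative three-state CTMC: Operational, Degraded (under attack), Failed.
    import numpy as np

    attack_rate = 0.5      # Operational -> Degraded (attacks per hour, assumed)
    mitigation_rate = 2.0  # Degraded -> Operational (successful mitigation, assumed)
    crash_rate = 0.1       # Degraded -> Failed (attack overwhelms the server, assumed)
    repair_rate = 1.0      # Failed -> Operational (recovery/restart, assumed)

    # Generator matrix Q: rows sum to zero, off-diagonal entries are transition rates.
    Q = np.array([
        [-attack_rate,     attack_rate,                      0.0],
        [mitigation_rate, -(mitigation_rate + crash_rate),   crash_rate],
        [repair_rate,      0.0,                             -repair_rate],
    ])

    # Steady-state probabilities pi solve pi @ Q = 0 with sum(pi) = 1.
    A = np.vstack([Q.T, np.ones(3)])
    b = np.array([0.0, 0.0, 0.0, 1.0])
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)

    availability = pi[0] + pi[1]   # server delivers (possibly degraded) service
    print(f"steady-state availability ~ {availability:.4f}")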
Developing software is hard. Developing software that is resilient and does not crash at the occurrence of unexpected inputs or events is even harder, especially with IoT devices and real-time requirements, e.g., due to interactions with human beings. Therefore, there is a need for a software architecture that helps software developers build fault-tolerant software with as little pain and effort as possible. To this end, we have designed a fault tolerance framework for automation systems that lets developers be mostly oblivious to fault tolerance issues, so that they can focus on the application logic encapsulated in (micro)services. That is, the developer only needs to specify the required fault tolerance level by description, not implementation. The fault tolerance aspects are transparent to the developer, as the framework takes care of them. This approach is particularly suited for the development of mixed-criticality systems, where different parts have very different and demanding functional and non-functional requirements. For such systems, highly specialized developers are needed, and removing the burden of fault tolerance results in a faster time to market and safer, more dependable systems.
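The idea of specifying fault tolerance "by description, not implementation" can be illustrated with a small declarative sketch. Everything below, the decorator, the specification fields, and the example service, is hypothetical and only mirrors the described approach; it is not the framework's actual API.

    # Hypothetical declarative fault-tolerance specification (illustration only).
    from dataclasses import dataclass
    from typing import Callable, Optional


    @dataclass
    class FaultToleranceSpec:
        """Describes WHAT level of fault tolerance a service needs, not HOW."""
        replicas: int = 1                       # number of redundant instances (assumed parameter)
        restart_on_crash: bool = True
        max_response_ms: Optional[int] = None   # real-time deadline, if any (assumed parameter)


    def fault_tolerant(spec: FaultToleranceSpec) -> Callable:
        """Attach the declared requirements; a framework runtime would read them and
        transparently provide replication, restarts, and deadline monitoring."""
        def wrap(service_cls):
            service_cls.__ft_spec__ = spec
            return service_cls
        return wrap


    @fault_tolerant(FaultToleranceSpec(replicas=3, max_response_ms=50))
    class DoorControlService:
        """The developer writes only the application logic of the (micro)service."""
        def handle(self, request):
            return {"door": "open"} if request.get("badge_ok") else {"door": "closed"}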
The Internet of Things (IoT) is a distributed, networked system composed of many embedded sensor devices. Unfortunately, these devices are resource constrained and susceptible to malicious data-integrity attacks and failures, leading to unreliability and sometimes to major failures of parts of, or even the entire, system. Intrusion detection and failure handling are essential requirements for IoT security. Nevertheless, as far as we know, the area of data-integrity detection for IoT has yet to receive much attention. Most previous intrusion-detection methods proposed for IoT, particularly for wireless sensor networks (WSNs), focus only on specific types of network attacks. Moreover, these approaches usually rely on precise values to specify abnormality thresholds. However, sensor readings are often imprecise, and crisp threshold values are inappropriate. To guarantee a lightweight, dependable monitoring system, we propose a novel hierarchical framework for detecting abnormal nodes in WSNs. The proposed approach uses fuzzy logic in event-condition-action (ECA) rule-based WSNs to detect malicious nodes, while also considering failed nodes. The spatiotemporal semantics of heterogeneous sensor readings are considered in the decision process to distinguish malicious data from other anomalies. Following our experiments with the proposed framework, we stress the significance of considering sensor correlations to achieve detection accuracy, which has been neglected in previous studies. Our experiments using real-world sensor data demonstrate that our approach can provide high detection accuracy with low false-alarm rates. We also show that our approach performs well when compared to two well-known classification algorithms.
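In the spirit of the fuzzy, spatiotemporal approach described above, the sketch below scores a sensor node's suspicion by combining its deviation from its own history (temporal) with its deviation from neighboring nodes (spatial). The membership functions, thresholds, and aggregation rule are illustrative assumptions, not the paper's design.

    # Minimal fuzzy anomaly scoring sketch for a sensor node (assumed thresholds).
    def fuzzy_high(value: float, low: float, high: float) -> float:
        """Degree (0..1) to which `value` counts as 'high', with a soft boundary."""
        if value <= low:
            return 0.0
        if value >= high:
            return 1.0
        return (value - low) / (high - low)


    def suspicion(reading: float, node_history_mean: float, neighbor_mean: float) -> float:
        """A node is suspicious only when it disagrees with both its own history
        AND its neighbors (fuzzy AND = min); agreement with neighbors suggests
        a genuine event rather than a malicious or faulty node."""
        temporal_dev = abs(reading - node_history_mean)
        spatial_dev = abs(reading - neighbor_mean)
        return min(fuzzy_high(temporal_dev, 2.0, 8.0),   # thresholds are assumed
                   fuzzy_high(spatial_dev, 2.0, 8.0))


    print(suspicion(30.0, 21.0, 29.5))   # ~0.0 -> deviates from history but matches neighbors: likely a real event
    print(suspicion(30.0, 21.0, 20.5))   # ~1.0 -> deviates from both: likely malicious or failed node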
Autonomous systems are gaining momentum in various application domains, such as autonomous vehicles, autonomous transport robotics and self-adaptation in smart homes. Product liability regulations impose high standards on manufacturers of such systems with respect to dependability (safety, security and privacy). Today's conventional engineering methods are not adequate for providing guarantees with respect to dependability requirements in a cost-efficient manner; e.g., road tests in the automotive industry accumulate millions of miles before a system can be considered sufficiently safe. System engineers will no longer be able to test or formally verify autonomous systems during development time in order to guarantee the dependability requirements in advance. In this vision paper, we introduce a new holistic software systems engineering approach for autonomous systems, which integrates development-time methods as well as operation-time techniques. With this approach, we aim to give users a transparent view of the confidence level of the autonomous system in use with respect to the dependability requirements. We present results already obtained and point out research goals to be addressed in the future.
Cyber-attacks and intrusions in cyber-physical control systems are, currently, difficult to reliably prevent. Knowing a system's vulnerabilities and implementing static mitigations is not enough, since threats advance faster than the pace at which static cyber solutions can counteract them. Accordingly, the practice of cybersecurity needs to ensure that intrusion and compromise do not result in system or environment damage or loss. In a previous paper [2], we described the Cyberspace Security Econometrics System (CSES), a stakeholder-aware and economics-based risk assessment method for cybersecurity. CSES allows an analyst to assess a system in terms of the estimated loss resulting from security breakdowns. In this paper, we describe two new related contributions: 1) we map the CSES method to the evaluation and mitigation steps described by the NIST Guide to Industrial Control Systems (ICS) Security, Special Publication 800-82r2, thereby presenting an economics-based and stakeholder-aware risk evaluation method for the implementation of the NIST SP 800-82 guide; and 2) we describe the application of this tailored method through a fictitious example of a critical infrastructure system of an electric and gas utility.
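As a rough illustration of a stakeholder-aware, economics-based loss estimate of the kind CSES produces, the sketch below weighs per-stakeholder stakes by the probability that each security requirement is violated. The stakeholders, matrices, and numbers are invented for illustration and do not reproduce the method's exact formulation.

    # Illustrative stakeholder-weighted expected-loss computation (assumed data).
    import numpy as np

    # Stakes: cost (k$ per incident) each stakeholder bears if a requirement fails.
    #                 confidentiality  integrity  availability
    stakes = np.array([
        [50.0, 80.0, 120.0],   # utility operator
        [10.0, 30.0,  40.0],   # regulator
        [ 5.0, 15.0,  60.0],   # customer
    ])

    # Estimated probability that each security requirement is violated over some period.
    violation_prob = np.array([0.02, 0.05, 0.10])

    # Expected loss per stakeholder = stakes weighted by violation probabilities.
    expected_loss = stakes @ violation_prob
    for name, loss in zip(["operator", "regulator", "customer"], expected_loss):
        print(f"{name}: expected loss ~ {loss:.1f} k$")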
The performance, dependability, and security of cloud service systems are vital for their ongoing operation, control, and support. Thus, controlled improvement in service requires a comprehensive analysis and systematic identification of the fundamental underlying constituents of the cloud using a rigorous discipline. In this paper, we introduce a framework which helps identify areas for potential cloud service enhancements. A cloud service cannot be completed if there is a failure in any of its underlying resources. In addition, resources are kept offline for scheduled maintenance. We use redundant resources to mitigate the impact of failures and maintenance in order to ensure performance and dependability, which also helps enhance security. For example, at least 4 replicas are required to tolerate the intrusion of a single instance or a single malicious attack/fault, as defined by Byzantine Fault Tolerance (BFT). Data centers with high performance, dependability, and security are outsourced to the cloud computing environment with greater flexibility in the cost of owning the computing infrastructure. In this paper, we analyze the effectiveness of redundant resource usage in terms of a dependability metric and the cost of service deployment based on the priority of service requests. The trade-offs among dependability, cost, and security under different redundancy schemes are characterized through comprehensive analytical models.
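The "at least 4 replicas" figure follows from the classical BFT requirement that tolerating f Byzantine (arbitrarily faulty or compromised) replicas needs n >= 3f + 1 replicas. The short sketch below shows that rule and an assumed per-replica cost; the cost figures are illustrative only.

    # BFT redundancy rule and an assumed cost model (illustration only).
    def replicas_needed(f: int) -> int:
        """Minimum replica count to tolerate f Byzantine faults: n = 3f + 1."""
        return 3 * f + 1


    cost_per_replica = 120.0   # $/month, assumed
    for f in range(0, 4):
        n = replicas_needed(f)
        print(f"tolerate f={f} faults -> n={n} replicas, cost ~ ${n * cost_per_replica:.0f}/month")
    # f=1 gives n=4, matching the 'at least 4 replicas' figure in the abstract.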
Demand Response Management (DRM) is a key component in the smart grid to effectively reduce power generation costs and user bills. However, it has been an open issue to address the DRM problem in a network of multiple utility companies and consumers where every entity is concerned about maximizing its own benefit. In this paper, we propose a Stackelberg game between utility companies and end-users to maximize the revenue of each utility company and the payoff of each user. We derive analytical results for the Stackelberg equilibrium of the game and prove that a unique solution exists. We develop a distributed algorithm which converges to the equilibrium with only local information available to both utility companies and end-users. Though DRM helps to facilitate the reliability of power supply, the smart grid can be susceptible to privacy and security issues because of the communication links between the utility companies and the consumers. We study the impact of an attacker who can manipulate the price information from the utility companies. We also propose a scheme based on the concept of shared reserve power to improve grid reliability and ensure its dependability.
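To illustrate the leader-follower structure of such a Stackelberg formulation, the sketch below iterates a price update (leader) against a demand best response (follower) until a fixed point is reached. The payoff functions, parameters, and update rule are invented for illustration and are not the paper's model or its distributed algorithm.

    # Toy leader-follower (Stackelberg) price/demand iteration (assumed functions).
    def consumer_demand(price: float, budget: float = 10.0, min_demand: float = 0.5) -> float:
        """Follower's best response: spend a fixed budget, never below a minimum demand."""
        return max(budget / price, min_demand)


    def company_price(demand: float, generation_cost: float = 1.0, margin: float = 0.2) -> float:
        """Leader's response (assumed rule): cost-plus pricing that rises with demand."""
        return generation_cost * (1.0 + margin * demand)


    price, demand = 1.5, 0.0
    for step in range(50):                     # simple fixed-point iteration
        new_demand = consumer_demand(price)
        new_price = company_price(new_demand)
        if abs(new_price - price) < 1e-9 and abs(new_demand - demand) < 1e-9:
            break
        price, demand = new_price, new_demand

    print(f"converged after {step} steps: price ~ {price:.3f}, demand ~ {demand:.3f}")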
We propose a resilience architecture for improving the security and dependability of authentication and authorization infrastructures, in particular those based on RADIUS and OpenID. This architecture employs intrusion-tolerant replication, trusted components and entrusted gateways to provide survivable services while ensuring compatibility with standard protocols. The architecture was instantiated in two prototypes, one implementing RADIUS and another implementing OpenID. These prototypes were evaluated in fault-free executions, under faults, under attack, and in diverse computing environments. The results show that, besides being more secure and dependable, our prototypes are capable of meeting the performance requirements of enterprise environments, such as IT infrastructures with more than 400k users.
Internet-scale software is becoming more and more important as a way to construct software systems as the Internet develops rapidly. Internet-scale software comprises a set of widely distributed software entities running in an open, dynamic and uncontrollable Internet environment. Several aspects impact the dependability of Internet-scale software, such as technical, organizational, decisional and human aspects. It is very important to evaluate the dependability of Internet-scale software by integrating all of these aspects and analyzing the system architecture from its most foundational elements; however, such an evaluation model has been lacking. This paper proposes an evaluation model of dependability for Internet-scale software on the basis of Bayesian Networks. The structure of Internet-scale software is analyzed, and an evaluation system of dependability for Internet-scale software is established, including static metrics, dynamic metrics, prior metrics and correction metrics. A process of trust attenuation based on assessment is proposed to integrate the subjective trust factors and objective dependability factors that impact system quality. A Bayesian Network is built according to the structure analysis, and a bottom-up method that uses Bayesian reasoning to analyze and calculate entity dependability and integration dependability layer by layer is described. A unified dependability of the whole system is worked out and corrected by objective data. The analysis of an experiment in a real system shows that the model is capable of evaluating the dependability of Internet-scale software clearly and objectively. Moreover, it offers effective help for the design, development, deployment and assessment of Internet-scale software.
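As a small illustration of bottom-up Bayesian reasoning over composed entities, the sketch below marginalizes a system-level dependability probability over the states of two entities using a tiny conditional probability table. The network structure and all probabilities are assumptions for illustration, not the model built in the paper.

    # Toy bottom-up Bayesian aggregation of entity dependabilities (assumed values).
    from itertools import product

    # Prior dependability of two entities (e.g. derived from static/dynamic metrics).
    p_dependable = {"A": 0.95, "B": 0.90}

    # Conditional probability that the integrated system is dependable given the
    # states of its entities (a small, assumed CPT).
    p_system_given = {
        (True, True): 0.99,
        (True, False): 0.40,
        (False, True): 0.30,
        (False, False): 0.01,
    }

    # Marginalize over entity states: P(system) = sum_s P(system | s) * P(s).
    p_system = 0.0
    for a_ok, b_ok in product([True, False], repeat=2):
        p_state = ((p_dependable["A"] if a_ok else 1 - p_dependable["A"])
                   * (p_dependable["B"] if b_ok else 1 - p_dependable["B"]))
        p_system += p_system_given[(a_ok, b_ok)] * p_state

    print(f"integrated system dependability ~ {p_system:.4f}")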
Byzantine fault tolerance has been intensively studied over the past decade as a way to enhance the intrusion resilience of computer systems. However, state-machine-based Byzantine fault tolerance algorithms require deterministic application processing and sequential execution of totally ordered requests. One way of increasing the practicality of Byzantine fault tolerance is to exploit the application semantics, which we refer to as application-aware Byzantine fault tolerance. Application-aware Byzantine fault tolerance makes it possible to facilitate concurrent processing of requests, to minimize the use of Byzantine agreement, and to identify and control replica nondeterminism. In this paper, we provide an overview of recent work on application-aware Byzantine fault tolerance techniques. We elaborate on the need for exploiting application semantics for Byzantine fault tolerance and the benefits of doing so, provide a classification of various approaches to application-aware Byzantine fault tolerance, and outline the mechanisms used in achieving application-aware Byzantine fault tolerance according to our classification.