Multi-model Testbed for the Simulation-based Evaluation of Resilience (Jan '23)
PI(s), Co-PI(s), Researchers:
- Peter Volgyesi (PI)
- Himanshu Neema (Co-PI)
HARD PROBLEM(S) ADDRESSED
This refers to Hard Problems, released in November 2012.
- Security Metrics Driven Evaluation, Design, Development, and Deployment
- Resilient Architectures
The goal of the Multi-model Testbed is to provide a collaborative design tool for evaluating various cyber-attack / defense strategies and their effects on the physical infrastructure. The web-based, cloud-hosted environment integrates state-of-the-art simulation engines for the different CPS domains and presents interesting research challenges as ready-to-use scenarios. Input data, model parameters, and simulation results are archived and versioned, with a strong emphasis on repeatability and provenance.
PUBLICATIONS
[1] Himanshu Neema, Thomas Roth, Chenli Wang, Wenqi Guo and Anirban Bhattacharjee. 2022. "Integrating Multiple HLA Federations for Effective Simulation-Based Evaluations of CPS", 4th International Workshop on Design Automation for CPS and IoT (DESTION 2022), Milan, Italy. May 2022.
[Abstract] Cyber-Physical Systems (CPS) are complex systems of computational, physical, and human components integrated to achieve some function over one or more networks. The use of distributed simulation, or co-simulation, is one method often used to analyze the behavior and properties of these systems. High-Level Architecture (HLA) is an IEEE co-simulation standard that supports the development and orchestration of distributed simulations. However, a simple HLA federation constructed with the component simulations (i.e., federates) does not satisfy several requirements that arise in real-world use cases such as the shared use of limited physical and computational resources, the need to selectively hide information from participating federates, the creation of reusable federates and federations for supporting configurable shared services, achieving performant distributed simulations, organizing federations across different model types or application concerns, and coordinating federations across organizations with different information technology policies. This paper describes these core requirements that necessitate the use of multiple HLA federations and presents various mechanisms for constructing such integrated HLA federations. An example use case is implemented using a model-based rapid simulation integration framework called the Universal CPS Environment for Federation (UCEF) to illustrate these requirements and demonstrate techniques for integrating multiple HLA federations.
[2] M. Maroti, W. Hedgecock, P. Volgyesi. "Rapid Design Space Exploration with Constraint Programming", The 4th Workshop on Design Automation for CPS and IoT (DESTION). May 2022.
[Abstract] Sample-efficient design space exploration (DSE) of complex CPS architectures remains a key challenge for identifying optimal configurations of components, design parameters and architectural choices. Detailed executable models require significant investment to build and are typically slow to evaluate. On the other hand, high-level conceptual models may lack the exactness or accuracy to evaluate and compare. In this paper we propose a constraint-based approach for capturing the design space and a vectorized, iterative solver for rapidly discovering Pareto-optimal design points. The paper describes the constraint-based modeling approach and developed tools through a concrete design optimization problem of unmanned underwater vehicles.
[3] Meiyi Ma, Himanshu Neema, and Janos Sztipanovits. 2022. "Recovery Planning using Simulation-Based Predictive Monitoring". Chapter accepted in Springer book by Alexander Knott at Army Research Laboratory, titled, "Autonomous Intelligent Cyber-Defense Agents", Springer, 2022. (Accepted).
[Abstract] Despite the rapid development of cybersecurity, recovery of the operation of the impacted cyber-physical system (CPS) after a cyber-attack, as a core element of cyber resilience, is often left to human decision-makers. There is a high demand for an autonomous intelligent cyber defense agent (AICA) for planning a rapid recovery. In this chapter, we introduce and demonstrate a system for recovery planning using simulation-based predictive monitoring to recover the system from attacks (cyber, physical, or hardware) and disruptions automatically. The recovery planning system first evaluates the impact of system degradation and generates courses of actions (COAs) for recovery efficiently. Then, it evaluates these COAs through integrated heterogeneous simulations that accounts for unavoidable uncertainty. By formalizing security and safety requirements, it formally verifies recovery COAs with confidence guarantees, and obtains the optimal recovery COAs. We present two recovery scenarios in smart cities to demonstrate the effectiveness of our recovery planning system.
KEY HIGHLIGHTS
Threat Modeling and Risk Analysis in Industrial Control Systems
In this effort, we are developing a modeling and analysis framework for threats and cybersecurity risks in Industrial Control Systems (ICS). Identifying system vulnerabilities and implementing appropriate risk mitigation strategies are crucial for ensuring the cybersecurity of ICS. These vulnerabilities must be evaluated according to their exploitability, impact, mitigation status, and target platform and environment. Therefore, to assess system vulnerabilities and risk mitigation strategies quantitatively, we are focusing on threat modeling and risk analysis methods for the cybersecurity of Railway Transportation Systems (RTS), which are real-world ICS that have become increasingly vulnerable to cyber-attacks due to their growing reliance on networked physical and computational components. Another interesting aspect of RTS is that their network topology changes continuously as locomotives move. These systems are, in general, cyber-physical systems with integral but non-stationary components. The key challenge posed by non-stationarity is the evolving nature of threats and vulnerability propagation, owing to dynamic network connections that form and disappear as components move.
The framework we are developing in this effort is called the Risk Analysis Framework (RAF). RAF has seven major components. The first component is a modeling environment for system architecture, where the ICS can be modeled with its complete component hierarchy and communication network topology. The second component allows for modeling cyber vulnerabilities, specifying attack ports and risk mitigation actions, and modeling risk flows across components through attack ports. It also enables creating a library of cyber exploits and mitigations. The third component provides validation of all models. The fourth component performs vulnerability assessment: it propagates risk within the system through network connections and hierarchical composition and generates the component attack trees and system attack graphs. It also ranks the system vulnerabilities in decreasing order of their impact on the overall system's cyber risk. The fifth component generates code and artifacts from the risk assessments. The sixth component is a major tool for risk management planning, which allows for cyber-gaming the available risk mitigation actions against potential cyber exploits. The seventh component is for visualization and analysis of results. We already visualize component attack trees and system attack trees. The work on visualization of risk management analysis is ongoing.
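The vulnerability assessment step can be illustrated with a minimal sketch. This is not RAF's algorithm: the component names, the scores, the attenuation factor, and the propagation rule (a component stays safe only if it resists its own vulnerabilities and the attenuated risk arriving from each upstream neighbor) are assumptions chosen purely for illustration.

```python
# Hypothetical example: names, scores, and the propagation rule are
# illustrative assumptions, not RAF's actual models or algorithm.

def propagate_risk(local_risk, connections, attenuation=0.5, rounds=10):
    """Iteratively propagate risk along directed connections.

    local_risk:  {component: probability-like score in [0, 1]}
    connections: list of (source, target) pairs; risk flows source -> target
    """
    risk = dict(local_risk)
    for _ in range(rounds):
        updated = {}
        for comp, own in local_risk.items():
            # Probability that the component stays uncompromised: it resists
            # its own vulnerabilities and every attenuated upstream risk.
            safe = 1.0 - own
            for src, dst in connections:
                if dst == comp:
                    safe *= 1.0 - attenuation * risk[src]
            updated[comp] = 1.0 - safe
        risk = updated
    return risk


def rank_components(risk):
    """Return components ordered by decreasing propagated risk."""
    return sorted(risk, key=risk.get, reverse=True)


local = {"wayside_radio": 0.6, "onboard_controller": 0.2, "dispatch_server": 0.1}
links = [("wayside_radio", "onboard_controller"),
         ("onboard_controller", "dispatch_server")]

propagated = propagate_risk(local, links)
ranking = rank_components(propagated)
print(ranking)
```

In this toy chain the ranking follows the direction of risk flow; a real assessment would also account for hierarchy composition and attack ports, which the sketch omits.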
In our work, we have successfully modeled the dynamic network connections and integrated them into dynamic vulnerability propagation algorithms. We previously showcased our work at the HotSoS'21 symposium. Subsequently, we extended the framework to incorporate cyber-gaming of exploits versus mitigations to plan for worst-case attacks, and we developed methods to deal with dynamic network connections, where the vulnerabilities and their propagation change continually with network connectivity. We published this work at the 16th International Conference on Critical Information Infrastructures Security (CRITIS 2021) and presented it at the conference, where it was well received.
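One simple way to read "planning for worst-case attacks" is as a minimax choice: for each mitigation, assume the attacker responds with the most damaging exploit, then pick the mitigation whose worst case is smallest. The sketch below assumes a hypothetical residual-risk table; the mitigation names, exploit names, and numbers are invented for illustration and do not come from RAF.

```python
# Hypothetical residual-risk table: residual[mitigation][exploit] is the
# system risk left when the defender picks `mitigation` and the attacker
# then picks `exploit`. All names and numbers are illustrative assumptions.
residual = {
    "patch_firmware":  {"radio_spoofing": 0.7, "malware_dropper": 0.2},
    "encrypt_links":   {"radio_spoofing": 0.3, "malware_dropper": 0.5},
    "segment_network": {"radio_spoofing": 0.4, "malware_dropper": 0.4},
}


def worst_case(mitigation):
    """Risk under the attacker's best response to a mitigation."""
    return max(residual[mitigation].values())


def best_mitigation():
    """Pick the mitigation that minimizes the worst-case residual risk."""
    return min(residual, key=worst_case)


choice = best_mitigation()
print(choice, worst_case(choice))
```

With a changing network topology, the table itself would have to be recomputed whenever connections form or disappear; the sketch treats it as fixed.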
We worked further on improving the methods and algorithms for dynamic risk management using cyber scenarios, as well as on integrating this framework with our tools for integrated, simulation-based quantitative evaluation of the cybersecurity of CPS. We have designed new methods to connect RAF with the integrated simulation testbed. In one of the demonstrated scenarios, we use the vulnerability scores of system components in RAF to design cybersecurity assessment scenarios in the integrated simulation environment. In addition, we analyze the security mechanisms in the simulation environment against cyber threats that mimic the vulnerabilities modeled in RAF and derive mitigation scores for these security mechanisms. The mitigation scores are then fed back to RAF to update the models and recalculate the risk profile of the system. We believe that this bidirectional link between the risk analysis framework and the integrated simulation testbed is a powerful new method for more accurately assessing the cybersecurity risks of CPS.
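The feedback loop described above can be sketched as follows. The scoring rules here are assumptions, not the ones used in RAF or the testbed: the mitigation score is taken to be the fraction of simulated attacks that a security mechanism blocked, and the updated risk is the vulnerability score scaled by the unmitigated fraction. Component names and numbers are hypothetical.

```python
# Illustrative assumptions: the scoring formulas and all names below are
# hypothetical and only show the direction of data flow between the tools.

def mitigation_score(blocked, attempted):
    """Fraction of simulated attacks stopped by a security mechanism."""
    if attempted <= 0:
        return 0.0
    return blocked / attempted


def update_risk_profile(vulnerability_scores, simulation_results):
    """Scale each vulnerability score by the fraction left unmitigated.

    vulnerability_scores: {component: score exported from the risk model}
    simulation_results:   {component: (blocked, attempted)} from simulation
    """
    profile = {}
    for component, score in vulnerability_scores.items():
        blocked, attempted = simulation_results.get(component, (0, 0))
        profile[component] = score * (1.0 - mitigation_score(blocked, attempted))
    return profile


scores = {"track_switch_plc": 8.0, "signal_gateway": 5.0}
results = {"track_switch_plc": (15, 20)}  # 15 of 20 simulated attacks blocked

updated = update_risk_profile(scores, results)
print(updated)
```

Components without simulation results keep their original score, which makes missing evidence visible rather than silently lowering the risk.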
General-Purpose ML Attack Library
Building on our previous work in the CPSWT framework on a general-purpose cyber-attack library and its use in resilience evaluation with courses of action, we have started investigating the idea of a general-purpose ML attack library. The idea is that these ML attacks will be generic, so that they can be quickly adapted, through simple configuration, to attack and test the resilience of different ML models. This work is in its initial stages, and we plan to use the DeepForge platform to develop this configurable, reusable ML attack library. DeepForge uses WebGME as its metamodeling environment and supports the Keras ML library for developing ML pipelines.
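To make the idea of a configurable, model-agnostic attack concrete, the sketch below selects an attack by name from a configuration and applies it to any model that can supply a loss gradient with respect to its input. It uses the well-known Fast Gradient Sign Method on a toy logistic-regression model written in plain Python. The registry, configuration keys, weights, and inputs are assumptions for illustration; the sketch does not use DeepForge or Keras and is not the planned library's design.

```python
import math

def fgsm(x, grad, epsilon):
    """Fast Gradient Sign Method: shift each feature by epsilon in the
    direction that increases the loss."""
    def sign(g):
        return (g > 0) - (g < 0)
    return [xi + epsilon * sign(gi) for xi, gi in zip(x, grad)]

# Hypothetical registry so that an attack can be chosen by configuration.
ATTACKS = {"fgsm": fgsm}

def run_attack(config, x, loss_gradient):
    """Apply the configured attack using a model-supplied gradient function."""
    attack = ATTACKS[config["attack"]]
    return attack(x, loss_gradient(x), **config.get("params", {}))

# Toy victim model: logistic regression with fixed, made-up weights.
WEIGHTS, BIAS = [2.0, -1.0], 0.0

def predict(x):
    z = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))

def loss_gradient_for_label(y):
    """Gradient of binary cross-entropy w.r.t. the input: (p - y) * w."""
    def grad(x):
        p = predict(x)
        return [(p - y) * w for w in WEIGHTS]
    return grad

x = [1.0, 0.5]
config = {"attack": "fgsm", "params": {"epsilon": 0.6}}
x_adv = run_attack(config, x, loss_gradient_for_label(1))
print(predict(x), predict(x_adv))
```

Because the attack only needs a gradient callback, the same configuration could in principle be pointed at a different model by swapping the callback.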
Resilient Consensus using Centerpoint Algorithm and Hashgraph Blockchain Based Communication
We are finalizing our work on this project and directing our efforts towards future work. We have obtained results and completed the work for the power testbed, demonstrating it with the integration of Hashgraph and Centerpoint. Currently, we are focused on finalizing the work on the Multi-UAV system. Furthermore, we have continued to design analytics for our resilient Multi-UAV target pursuit to evaluate whether the data distributed to each agent is consistent and whether each agent has a consistent global view of the target. We evaluated these analytics in both non-adversarial and attack scenarios and concluded that the system maintains a globally consistent view and converges to the target.
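As a simplified illustration of centerpoint-based resilient consensus, consider the one-dimensional case, where a centerpoint of a set of values is a median. The sketch below assumes that every agent receives the same set of reported values in each round (the property a consistent distribution layer is meant to provide) and that fewer than half of the agents are adversarial. The agent names and values are hypothetical, and the actual system works with multi-dimensional states rather than scalars.

```python
from statistics import median

def consensus_round(values, adversaries):
    """One synchronous round: every normal agent adopts the median of all
    reported values, while adversarial agents keep reporting their own."""
    agreed = median(values.values())
    return {agent: (value if agent in adversaries else agreed)
            for agent, value in values.items()}


def run_consensus(values, adversaries, rounds=5):
    for _ in range(rounds):
        values = consensus_round(values, adversaries)
    return values


# Four normal agents with target estimates 1..4 and one adversary that
# always reports a far-off value.
initial = {"uav1": 1.0, "uav2": 2.0, "uav3": 3.0, "uav4": 4.0, "bad": 100.0}
final = run_consensus(initial, adversaries={"bad"})
normal = [v for agent, v in final.items() if agent != "bad"]
print(normal)
```

The normal agents agree on a value inside the range of their own initial estimates, regardless of what the single adversary reports.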
EDUCATIONAL ADVANCES and OUTREACH
Presentations and discussion with National Security Staff for TN Sen. Marsha Blackburn & TN Sen. Bill Hagerty
June 28, 2022
Discussion topics:
Research topics in Wireless/RF security and resilience
Collaboration with NIST on threat modeling and risk analysis in ICS
Discussion topics:
Threat modeling in Railway ICS
Risk Analysis
Quantitative Risk Evaluation
Integration with Simulation-Based Evaluation
Cybersecurity Risks of Water-Distribution Networks
VU/ISIS Summer Internship Seminar Series
Dr. Himanshu Neema is currently advising an undergraduate student during an internship at our institute. Please note that the interns described here work with our technologies, but the internships are not funded by this project. The internship project is "Evaluation of Vector Control and Social Policies on Pathogen Spread within Communities." It aims to use agent-based simulations to model arthropod behavior and human activities, as well as social policies for vector control and for changing human behavior, in order to evaluate how these affect the spread of pathogens in humans through mosquito bites. We plan to use integrated simulations for these evaluations. We have developed RESTful APIs for the creation, configuration, parameterization, execution, and control of the disease simulations. Currently, we are working on creating a model-based experimentation environment using these REST APIs. The current work also involves developing a reinforcement learning algorithm for learning effective vector control policies within the constraints of the local county health department. Additionally, this work is being converted into a web-accessible design studio so that other researchers can experiment with the platform. More recently, we have integrated a modeling environment and the necessary tooling for the Common Workflow Language (CWL), which supports complex workflow designs in which each compute step can be a C/C++/Python program or a full-fledged containerized Docker execution. This enables us to package the epidemiological simulations (currently supported simulators are EMOD and MicroMoB) into Docker images and use them as compute steps within the workflows.
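A containerized compute step of this kind can be described as a CWL CommandLineTool. The sketch below builds such a description as a Python dictionary and serializes it to JSON, which CWL accepts alongside YAML. Only the CWL field names come from the CWL standard; the image name, command, and file names are hypothetical placeholders and do not reflect how the EMOD or MicroMoB images are actually packaged.

```python
import json

# Hypothetical image name, command, and file names; only the CWL field
# names (cwlVersion, class, baseCommand, ...) come from the CWL standard.
def make_simulation_step(image, command, output_file):
    """Describe one containerized compute step as a CWL CommandLineTool."""
    return {
        "cwlVersion": "v1.2",
        "class": "CommandLineTool",
        "baseCommand": command,
        "requirements": {"DockerRequirement": {"dockerPull": image}},
        "inputs": {
            # The simulation configuration is passed as the first argument.
            "config": {"type": "File", "inputBinding": {"position": 1}},
        },
        "outputs": {
            # The step's result is collected by matching the output file name.
            "results": {"type": "File", "outputBinding": {"glob": output_file}},
        },
    }


step = make_simulation_step(
    image="example.org/epi-sim:latest",
    command=["run-simulation"],
    output_file="results.json",
)
document = json.dumps(step, indent=2)
print(document)
```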
Dr. Himanshu Neema is also advising another undergraduate student for a summer internship in the Heterogeneous Simulation Integration project. The aim of the project is to provide a framework for model-based rapid synthesis of large-scale, distributed, and integrated heterogeneous simulations. The systems of interest (e.g., critical infrastructures such as transportation networks, electricity networks, or water distribution networks, or even large command-and-control organizations such as those in the military or the air force) have many different subsystems, which are themselves quite complex. Thus, each of these subsystems requires its own specific simulation tools for modeling and analysis. For evaluating such large systems (or, as we call them, "systems-of-systems"), evaluating their parts in isolation is not sufficient. What is needed is to integrate the different simulators in a logically and temporally coherent manner so that they work together and allow us to evaluate these large systems as a whole. Over the past several years in this project, we have developed a model-based framework for modeling these large systems-of-systems, together with code-generation tools that automatically synthesize the integrated system-of-systems simulations. When executed, the integrated simulations all run concurrently, are time-synchronized, and exchange data corresponding to their system-level interdependence. The simulators are highly heterogeneous: they use different modeling languages, represent different real-world systems, have different models of computation, and are written in different programming languages. Hence, wiring them together in a consistent manner is extremely challenging. As part of the internship in this project, the student is learning about the IEEE High-Level Architecture (HLA), the HLA Run-Time Infrastructure (RTI), Model-Integrated Computing (MIC), Scenario-Based Experimentation, Unit Testing, etc.
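The requirement that simulators be "temporally coherent" can be illustrated with a toy coordinator. This is not the HLA RTI API (which uses time-advance requests, grants, and lookahead); it is a minimal sketch of the underlying idea that only the simulator with the smallest next time may advance, so no simulator ever receives data from its own past. The class, simulator names, and step sizes are hypothetical.

```python
class ToySimulator:
    """Hypothetical simulator that advances in fixed steps of its own size."""

    def __init__(self, name, step):
        self.name, self.step, self.time = name, step, 0
        self.inbox = []

    def next_time(self):
        return self.time + self.step

    def advance(self):
        """Move to the next step and consume all pending messages."""
        self.time = self.next_time()
        received, self.inbox = self.inbox, []
        return (self.name, self.time, received)


def run(simulators, end_time):
    """Advance simulators in global time order and exchange timestamped data."""
    log = []
    while True:
        # Conservative rule: the simulator with the smallest next time goes first.
        sim = min(simulators, key=lambda s: s.next_time())
        if sim.next_time() > end_time:
            break
        log.append(sim.advance())
        for other in simulators:
            if other is not sim:
                other.inbox.append((sim.name, sim.time))
    return log


log = run([ToySimulator("power", 2), ToySimulator("network", 3)], end_time=6)
for name, time, received in log:
    print(time, name, received)
```

Even though the two simulators use different step sizes, the log is ordered in time and every message is delivered at or after the time it was produced.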