Multi-model Testbed for the Simulation-based Evaluation of Resilience (Oct '21)
PI(s), Co-PI(s), Researchers:
- Peter Volgyesi (PI)
- Himanshu Neema (Co-PI)
HARD PROBLEM(S) ADDRESSED
This refers to Hard Problems, released in November 2012.
- Security Metrics Driven Evaluation, Design, Development, and Deployment
- Resilient Architectures
The goal of the Multi-model Testbed is to provide a collaborative design tool for evaluating various cyber-attack / defense strategies and their effects on the physical infrastructure. The web-based, cloud-hosted environment integrates state-of-the-art simulation engines for the different CPS domains and presents interesting research challenges as ready-to-use scenarios. Input data, model parameters, and simulation results are archived and versioned, with a strong emphasis on repeatability and provenance.
PUBLICATIONS
[1] Himanshu Neema, Leqiang Wang, Xenofon Koutsoukos, CheeYee Tang, and Keith Stouffer. 2021. Model-Based Risk Analysis Approach for Network Vulnerability and Security of the Critical Railway Infrastructure, in the 16th International Conference on Critical Information Infrastructures Security (CRITIS' 2021). Sep. 27-29, 2021, Lausanne, Switzerland.
[2] Himanshu Neema, Leqiang Wang, Xenofon Koutsoukos, CheeYee Tang, and Keith Stouffer. 2021. WiP: A Model-Based Risk Analysis Approach for Network Vulnerability of Railway Infrastructure. Work-in-progress paper presented at the 8th Symposium on Hot Topics in the Science of Security (HotSoS '21). Association for Computing Machinery, Nashville, TN, USA. (presentation)
[3] A. I. Ozdagli and X. Koutsoukos. 2021. Model-based Damage Detection through Physics-guided Learning for Dynamic Systems, in the 13th Annual Conference of the Prognostics and Health Management Society. November 2021.
KEY HIGHLIGHTS
Threat Modeling and Risk Analysis in Industrial Control Systems
In this effort, we are developing a modeling and analysis framework for threats and cybersecurity risks in Industrial Control Systems (ICS). Identifying system vulnerabilities and implementing appropriate risk mitigation strategies are crucial for ensuring the cybersecurity of ICS. These vulnerabilities must be evaluated according to their exploitability, impact, mitigation status, and target platform and environment. To assess system vulnerabilities and risk mitigation strategies quantitatively, we are focusing on threat modeling and risk analysis methods for the cybersecurity of Railway Transportation Systems (RTS), which are real-world ICS and have become increasingly vulnerable to cyber-attacks due to their growing reliance on networked physical and computational components.
Another interesting aspect of RTS is that these systems have a continuously changing network topology due to moving locomotives. In general, they are cyber-physical systems with integral but non-stationary components. The key challenge posed by non-stationarity is the evolving nature of threats and vulnerability propagation: network connections form and disappear as components move, so the paths along which a vulnerability can propagate change over time.
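To make the effect of non-stationary connectivity concrete, the following minimal sketch computes the earliest time at which a compromise could reach each component when links are only available during time windows. It is an illustration under stated assumptions, not the propagation algorithm used in our framework: the link tuples, the component names, and the rule that a compromise crosses an available link instantaneously are all hypothetical.

```python
import heapq
from collections import defaultdict

def earliest_compromise(links, source, t_start=0.0):
    """Earliest time each component can be reached from `source`.

    links: iterable of (u, v, t_open, t_close), an undirected link that
    exists only during [t_open, t_close] (hypothetical representation).
    Assumption: a compromised component stays compromised and can cross
    a link at any moment the link is up, with zero traversal delay.
    """
    adj = defaultdict(list)
    for u, v, t_open, t_close in links:
        adj[u].append((v, t_open, t_close))
        adj[v].append((u, t_open, t_close))

    earliest = {source: t_start}
    heap = [(t_start, source)]
    while heap:
        t, node = heapq.heappop(heap)
        if t > earliest.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, t_open, t_close in adj[node]:
            if t > t_close:
                continue  # the link disappeared before the compromise arrived
            arrival = max(t, t_open)  # wait until the link forms
            if arrival < earliest.get(nxt, float("inf")):
                earliest[nxt] = arrival
                heapq.heappush(heap, (arrival, nxt))
    return earliest

# Hypothetical scenario: a train moves from wayside unit A to wayside unit B.
LINKS = [
    ("wayside_A", "train_1", 0, 10),
    ("train_1", "wayside_B", 20, 30),
    ("wayside_B", "control_center", 0, 100),
    ("train_2", "wayside_B", 5, 15),
]
reach = earliest_compromise(LINKS, "wayside_A")
```

In this scenario the moving train carries the compromise from one wayside unit to the other, while `train_2` is never reached because its link to `wayside_B` closes before `wayside_B` is compromised. A static analysis of the union of all links would miss that distinction.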
Our framework for this effort is called the Risk Analysis Framework (RAF). RAF has seven major components. The first component is a modeling environment for system architecture, in which the ICS can be modeled with its complete component hierarchy and communication network topology. The second component allows for modeling cyber vulnerabilities, specifying attack ports and risk mitigation actions, and defining risk flows across components through attack ports. It also enables creating a library of cyber exploits and mitigations. The third component provides validation of all models. The fourth component performs vulnerability assessment: it propagates risk within the system through network connections and hierarchical composition, and generates the component attack trees and system attack graphs. It also ranks the system vulnerabilities in decreasing order of their impact on the overall system's cyber risk. The fifth component generates code and artifacts from the risk assessments. The sixth component is a major tool for risk management planning, which allows for cyber gaming of the available risk mitigation actions against potential cyber exploits. The seventh component supports visualization and analysis of results. We already visualize component attack trees and system attack graphs. The work on visualization of risk management analysis is ongoing.
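The vulnerability assessment and ranking step can be illustrated with a small sketch. This is not RAF's implementation: the scoring model (likelihood of the most probable attack path times a per-component impact weight), the what-if ranking by removing one vulnerability at a time, and all names and numbers below are assumptions chosen for illustration.

```python
def compromise_likelihood(edges, entry_points, nodes):
    """Likelihood of the most probable attack path to each node.

    edges: dict vuln_id -> (src, dst, exploit_probability), hypothetical.
    Uses Bellman-Ford style relaxation on products of probabilities.
    """
    like = {n: 0.0 for n in nodes}
    for e in entry_points:
        like[e] = 1.0
    for _ in range(max(len(nodes) - 1, 1)):
        changed = False
        for src, dst, p in edges.values():
            cand = like[src] * p
            if cand > like[dst]:
                like[dst] = cand
                changed = True
        if not changed:
            break
    return like

def system_risk(edges, entry_points, impact):
    """Assumed system risk: sum of likelihood x impact over components."""
    like = compromise_likelihood(edges, entry_points, list(impact))
    return sum(like[n] * impact[n] for n in impact)

def rank_vulnerabilities(edges, entry_points, impact):
    """Rank vulnerabilities by how much system risk drops if each is mitigated."""
    base = system_risk(edges, entry_points, impact)
    scores = {}
    for vid in edges:
        remaining = {k: v for k, v in edges.items() if k != vid}
        scores[vid] = base - system_risk(remaining, entry_points, impact)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical four-component system.
IMPACT = {"internet": 0.0, "hmi": 2.0, "historian": 1.0, "plc": 10.0}
EDGES = {
    "V1": ("internet", "hmi", 0.5),
    "V2": ("hmi", "plc", 0.4),
    "V3": ("internet", "historian", 0.8),
    "V4": ("historian", "plc", 0.1),
}
ranking = rank_vulnerabilities(EDGES, ["internet"], IMPACT)
```

With these made-up numbers, `V1` ranks first because mitigating it cuts off the most probable path to the high-impact `plc`, while `V4` ranks last because a more probable path to the same target already exists.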
In our work, we have successfully modeled dynamic network connections and integrated them into dynamic vulnerability propagation algorithms. We previously showcased this work at the HotSoS '21 symposium. Subsequently, we extended the framework to incorporate cyber-gaming of exploits versus mitigations to plan for worst-case attacks, and developed methods for dynamic network connections, where the vulnerabilities and their propagation continually change with network connectivity. We published this work in the 16th International Conference on Critical Information Infrastructures Security (CRITIS' 2021) and presented it at the conference, where it was well received.
We are working on further improving the methods and algorithms for dynamic risk management using cyber scenarios, as well as on integrating this framework with our tools for integrated, simulation-based quantitative evaluation of the cybersecurity of CPS.
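One simple way to read "planning for worst-case attacks" is a pure-strategy minimax choice over a table of residual risks. The sketch below assumes such a table is available; the mitigation names, exploit names, and risk values are hypothetical, and the actual cyber-gaming method in the framework may differ.

```python
def best_worst_case(residual_risk):
    """Pick the mitigation whose worst-case residual risk is smallest.

    residual_risk[m][e] is the (assumed) risk remaining when mitigation m
    is deployed and the attacker uses exploit e.
    """
    worst = {m: max(row.values()) for m, row in residual_risk.items()}
    choice = min(worst, key=worst.get)
    return choice, worst[choice]

# Hypothetical payoff table.
RESIDUAL_RISK = {
    "patch_firmware": {"spoof_signal": 6, "dos_radio": 3},
    "segment_network": {"spoof_signal": 4, "dos_radio": 5},
    "do_nothing": {"spoof_signal": 9, "dos_radio": 8},
}
choice, risk = best_worst_case(RESIDUAL_RISK)
```

Here the defender assumes the attacker always picks the most damaging exploit against whichever mitigation is deployed, so `segment_network` is preferred even though `patch_firmware` performs better against one of the two exploits.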
Resilient Consensus using Centerpoint Algorithm and Hashgraph Blockchain Based Communication
We have started to work on a resilient multi-agent system that integrates a centerpoint algorithm and Hashgraph technology to counteract Byzantine and DDoS attacks. The centerpoint is an aggregate function used to find a safe point in the convex hull of the normal agents' estimates, so that the agents still converge on the true objective even when Byzantine agents try to shift it. The centerpoint algorithm is being developed for a three-dimensional space. So far, we have a three-dimensional algorithm that uses the centerpoint to converge in the absence of Byzantine attackers. We are looking to further improve this algorithm to work with Byzantine agents.
In tandem with the centerpoint algorithm, we are developing a method for agents to exchange messages through the Hashgraph Blockchain. The Hashgraph Blockchain ensures that the network remains stable as long as fewer than one third of the agents are attacked. This resilient communication technique will ensure that the agents have the correct data to aggregate with the centerpoint algorithm.
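For intuition, a centerpoint of n points in d dimensions is a point whose halfspace (Tukey) depth is at least n/(d+1): every closed halfspace containing it also contains at least that many of the points. If fewer than n/(d+1) estimates are adversarial, every such halfspace contains at least one normal estimate, so the centerpoint lies in the convex hull of the normal estimates. The sketch below is only an approximation for illustration, not our algorithm: it tests a fixed set of 26 directions (so the computed depth is an upper bound on the true depth) and restricts candidates to the reported estimates themselves.

```python
from itertools import product

# Fixed test directions: all nonzero vectors with components in {-1, 0, 1}.
DIRECTIONS = [d for d in product((-1, 0, 1), repeat=3) if any(d)]

def approx_depth(q, points, directions=DIRECTIONS):
    """Upper bound on the halfspace depth of q with respect to `points`.

    For each direction u, counts the points p with u . (p - q) >= 0 and
    returns the smallest count found.
    """
    best = len(points)
    for u in directions:
        count = sum(
            1 for p in points
            if sum(ui * (pi - qi) for ui, pi, qi in zip(u, p, q)) >= 0
        )
        best = min(best, count)
    return best

def approx_centerpoint(points):
    """Return the reported estimate with the largest approximate depth."""
    return max(points, key=lambda q: approx_depth(q, points))

# Hypothetical estimates: seven normal agents and two Byzantine outliers.
NORMAL = [
    (0.5, 0.5, 0.5),
    (0.0, 0.5, 0.5), (1.0, 0.5, 0.5),
    (0.5, 0.0, 0.5), (0.5, 1.0, 0.5),
    (0.5, 0.5, 0.0), (0.5, 0.5, 1.0),
]
BYZANTINE = [(100.0, 100.0, 100.0), (90.0, 110.0, 95.0)]
safe_point = approx_centerpoint(NORMAL + BYZANTINE)
```

With nine estimates in three dimensions the bound n/(d+1) is 2.25, so two outliers can be tolerated in principle. In this example the selected point is the central normal estimate, whereas a plain average would be dragged far toward the outliers.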
For testing, we are using Microsoft AirSim as a simulator for a multi-drone system. To emulate the drones, we have set up five Raspberry Pis running the PX4 flight controller in software-in-the-loop mode. PX4 acts as the drone, while the AirSim simulator runs on another machine to visualize the movements of the drone. Also, each virtual machine will have a Hashgraph API installed to interact with the Hashgraph Blockchain client and send messages to other agents. The agents will utilize the Hashgraph client and the centerpoint algorithm to solve target pursuit and formation control problems during attacks.
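As a back-of-the-envelope check on what a five-agent setup can tolerate, the sketch below combines two standard bounds. Both are assumptions about how the bounds apply here rather than measured properties of our testbed: the usual Byzantine fault tolerance requirement n >= 3f + 1 for the ledger, and the centerpoint requirement f < n/(d+1) for aggregation in d dimensions.

```python
def ledger_fault_limit(n_agents: int) -> int:
    """Largest f satisfying n >= 3f + 1 (standard BFT bound, assumed here)."""
    return (n_agents - 1) // 3

def centerpoint_fault_limit(n_agents: int, dim: int = 3) -> int:
    """Largest integer f satisfying f < n / (dim + 1)."""
    return (n_agents - 1) // (dim + 1)

def tolerated_faults(n_agents: int, dim: int = 3) -> int:
    """Faults tolerated when both the ledger and the aggregation must hold."""
    return min(ledger_fault_limit(n_agents), centerpoint_fault_limit(n_agents, dim))

faults_for_five = tolerated_faults(5)
```

Under these assumptions, five agents in three dimensions tolerate a single compromised agent, and both bounds happen to give the same limit.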
Physics-guided Learning and Surrogate Modeling - Resilient CPS Applications
We continued our experimentation work for structural design and health monitoring for CPS applications [3]. In this work period, we developed two alternative methods for auto-generating FEA static stress simulation results from relatively simple parametric CAD models (pressure vessel capsules). The first approach uses the commercial ANSYS toolchain, while the second is an open-source alternative based on FreeCAD (using the CalculiX FEM tool).
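A parametric sweep of this kind can be sanity-checked against closed-form thin-walled pressure vessel theory before running any FEA job. The sketch below is not part of either toolchain: the parameter names and ranges are hypothetical, and it only covers the cylindrical wall of a capsule, where the hoop stress is p r / t and the axial stress is p r / (2 t).

```python
import math
from itertools import product

def thin_wall_stresses(pressure, radius, thickness):
    """Thin-walled cylinder stresses (Pa) for pressure (Pa), radius and thickness (m)."""
    hoop = pressure * radius / thickness
    axial = hoop / 2.0
    # Plane-stress von Mises combination of the two principal stresses.
    von_mises = math.sqrt(hoop**2 - hoop * axial + axial**2)
    return {"hoop": hoop, "axial": axial, "von_mises": von_mises}

def sweep(radii, thicknesses, pressures, max_ratio=0.1):
    """Enumerate design points, skipping those outside the thin-wall regime.

    max_ratio is an assumed cutoff on thickness / radius.
    """
    cases = []
    for r, t, p in product(radii, thicknesses, pressures):
        if t / r > max_ratio:
            continue  # thin-wall formulas are not a reasonable baseline here
        cases.append({"radius": r, "thickness": t, "pressure": p,
                      **thin_wall_stresses(p, r, t)})
    return cases

# Hypothetical design space.
cases = sweep(radii=[0.1, 0.2], thicknesses=[0.005, 0.03], pressures=[1e6, 2e6])
```

Comparing the simulated stress away from the end caps with these analytic values gives a quick check that a generated dataset entry is plausible. Large deviations usually point to a meshing or boundary-condition problem rather than to new physics.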
The generated datasets are used for developing physics-guided ML models and are also used for experimenting with a graph machine learning-based approach for FEM surrogate modeling and/or topology optimization.
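A common first step for graph-based learning on FEM data is turning the mesh into a graph whose nodes are mesh nodes and whose edges follow element connectivity. The sketch below shows one plausible conversion for linear tetrahedral meshes, where every pair of nodes in an element shares an edge. The function name and the input layout are assumptions, not our data format.

```python
from itertools import combinations

def mesh_to_edges(elements):
    """Unique undirected edges of a tetrahedral mesh.

    elements: iterable of 4-tuples of node ids (linear tetrahedra).
    Note: connecting all node pairs is only valid for simplex elements;
    hexahedra would need their explicit edge list instead.
    """
    edges = set()
    for elem in elements:
        for a, b in combinations(sorted(elem), 2):
            edges.add((a, b))
    return sorted(edges)

# Two tetrahedra sharing the face (1, 2, 3).
edges = mesh_to_edges([(0, 1, 2, 3), (1, 2, 3, 4)])
```

Node coordinates and applied loads can then be attached as node features, with the simulated stress values as the regression targets for the surrogate.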
TECHNOLOGY TRANSFER and BROADER IMPACT
While the physics-guided learning approach has broad use cases in CPS (e.g., buildings, transportation infrastructure), we successfully applied the results in the design process of unmanned underwater vehicles (UUV) as part of the DARPA Symbiotic program.
EDUCATIONAL ADVANCES and OUTREACH
Collaboration with Cybersecurity Research Group at Fujitsu System Integration Laboratories Ltd
Discussion topics:
- Threat Intelligence - Sharing Policy Enforcement (SPE)
- WebGME-based modeling tool development for SPE
- Integration with the MITRE ATT&CK framework
Collaboration with NIST on threat modeling and risk analysis in ICS
Discussion topics:
- Threat modeling in Railway ICS
- Risk Analysis
- Quantitative Risk Evaluation
- Integration with Simulation-Based Evaluation
VU/ISIS Summer Internship Seminar Series
As part of our summer internship program, we presented our ongoing research projects to undergraduate students. Peter Volgyesi (PI) gave a presentation and led a discussion on AI-driven communication infrastructure CPS topics on May 26, 2021. The material covered important parts of the testbed (WebGME/DeepForge) with the goal of enticing prospective graduate students for the Lablet project.
Further, Dr. Himanshu Neema is currently advising an undergraduate student for his internship at our institute. Please note that the student is working with our technologies, but the internship is not funded by this project. The internship project is "Evaluation of Vector Control and Social Policies on Pathogen Spread within Communities." It aims to use agent-based simulations to model arthropod behavior, human activities, and social policies for vector control and for changing human behavior, in order to evaluate how these affect the spread of pathogens in humans through mosquito bites. We plan to use integrated simulations for these evaluations. We have developed RESTful APIs for the creation, configuration, parameterization, execution, and control of the disease simulations. Currently, we are working on creating a model-based experimentation environment using these REST APIs. The current work also involves developing a reinforcement learning algorithm for learning effective vector control policies within the constraints of the local county health department. Additionally, this work is being converted into a web-accessible design studio for other researchers to experiment with the platform.