Biblio
We present an online framework for learning and updating security policies in dynamic IT environments. It includes three components: a digital twin of the target system, which continuously collects data and evaluates learned policies; a system identification process, which periodically estimates system models based on the collected data; and a policy learning process that is based on reinforcement learning. To evaluate our framework, we apply it to an intrusion prevention use case that involves a dynamic IT infrastructure. Our results demonstrate that the framework automatically adapts security policies to changes in the IT infrastructure and that it outperforms a state-of-the-art method.
Automatic optimal response systems are essential for preserving power system resilience and ensuring faster recovery from emergencies under cyber compromise. Numerous research works have developed such response engines for cyber and physical system recovery separately. In this paper, we propose a novel cyber-physical decision support system, SCORE, that computes optimal actions considering pure and hybrid cyber-physical states, using a Markov Decision Process (MDP). Such an automatic decision-making engine can help power system operators and network administrators respond faster to prevent cascading failures and attack escalation, respectively. The hybrid nature of the engine makes the reward and state transition models of the MDP unique. Value iteration and policy iteration techniques are used to compute the optimal actions. Tests are performed on three- and five-substation power systems to recover from attacks that compromise relays to cause transmission line overflow. The paper also analyses the impact of the reward and state transition models on computation. The results verify the efficacy of the proposed engine.
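As an illustration of the value iteration technique named in the abstract above, the following is a minimal sketch on a toy three-state recovery MDP. The state names, action names, transition probabilities, rewards, and discount factor are all hypothetical and chosen only for illustration; they are not taken from SCORE or its test systems.

```python
# Toy MDP (hypothetical): transitions[state][action] is a list of
# (probability, next_state, reward) outcomes.
transitions = {
    "attacked": {
        "wait": [(1.0, "attacked", -10.0)],
        "isolate": [(0.8, "isolated", -2.0), (0.2, "attacked", -10.0)],
    },
    "isolated": {
        "wait": [(1.0, "isolated", -1.0)],
        "restore": [(0.9, "restored", -1.0), (0.1, "isolated", -1.0)],
    },
    # Absorbing goal state with zero reward.
    "restored": {
        "wait": [(1.0, "restored", 0.0)],
    },
}


def q_value(outcomes, values, gamma):
    """Expected discounted return of one action given current state values."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)


def value_iteration(transitions, gamma=0.9, tol=1e-10):
    """Apply Bellman optimality backups until the largest update is below tol."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            best = max(q_value(o, values, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    # Extract the greedy policy with respect to the converged values.
    policy = {
        s: max(actions, key=lambda a: q_value(actions[a], values, gamma))
        for s, actions in transitions.items()
    }
    return values, policy


values, policy = value_iteration(transitions)
print(policy)
```

In this toy model the greedy policy isolates the compromised component and then restores it, because waiting accrues a larger per-step penalty than acting. Policy iteration would reach the same policy by alternating policy evaluation and greedy improvement instead of repeated value backups.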