A Hybrid Game Theory and Reinforcement Learning Approach for Cyber-Physical Systems Security

Submitted by grigby1 on Tue, 02/16/2021 - 1:39pm

Title	A Hybrid Game Theory and Reinforcement Learning Approach for Cyber-Physical Systems Security
Publication Type	Conference Paper
Year of Publication	2020
Authors	Khoury, J., Nassar, M.
Conference Name	NOMS 2020 - 2020 IEEE/IFIP Network Operations and Management Symposium
Keywords	CPS, CPS network, cps security, cyber-attacks, Cyber-physical systems, Cyber-physical systems security, Damage Assessment, game theory, hybrid game theory, ICs, invasive software, learning (artificial intelligence), malware author, Multi-Agent Reinforcement Learning, multi-agent systems, multiagent reinforcement learning, Nash equilibrium, pubcrawl, resilience, Resiliency, SCADA systems, Social Agents, supervisory control and data acquisition systems, Virus Spreading
Abstract	Cyber-Physical Systems (CPS) are monitored and controlled by Supervisory Control and Data Acquisition (SCADA) systems that use advanced computing, sensors, control systems, and communication networks. Initially, CPS and SCADA systems were secured by isolation. However, with recent advances in industrial technology, the increased connectivity of CPS and SCADA systems to enterprise networks has exposed them to new cybersecurity threats and made them a primary target for cyber-attacks with the potential to cause catastrophic economic, social, and environmental damage. Recent research focuses on new methodologies for risk modeling and assessment using game theory and reinforcement learning. This paper proposes framing CPS security on two levels, strategic and battlefield, by combining ideas from game theory and Multi-Agent Reinforcement Learning (MARL). The strategic level is modeled as an imperfect-information, extensive-form game in which the human administrator and the malware author decide on the strategies of defense and attack, respectively. At the battlefield level, strategies are implemented by machine learning agents that derive optimal policies for run-time decisions. The outcomes of these policies manifest as utilities at the strategic level, where we aim to reach a Nash Equilibrium (NE) in favor of the defender. We simulate the scenario of a virus spreading in a CPS network. We present experiments using the MiniCPS simulator and the OpenAI Gym toolkit and discuss the results.
DOI	10.1109/NOMS47738.2020.9110453
Citation Key	khoury_hybrid_2020
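
The two-level idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it simplifies the strategic level to a two-player normal-form game (the paper uses an imperfect-information, extensive-form game), and the strategy names and utility values are hypothetical placeholders for the outcomes that battlefield-level agents would produce. It only shows how a pure-strategy Nash equilibrium could be located once those utilities are known.

```python
from itertools import product

# Hypothetical strategy labels; they are not taken from the paper.
DEFENDER_STRATEGIES = ["patch_first", "isolate_first"]
ATTACKER_STRATEGIES = ["fast_spread", "stealth_spread"]

# Hypothetical utilities (defender, attacker) for each strategy pair.
# In the approach described above, such values would come from the
# outcomes of the battlefield-level learning agents; here they are made up.
PAYOFFS = {
    ("patch_first", "fast_spread"): (3, -3),
    ("patch_first", "stealth_spread"): (1, -1),
    ("isolate_first", "fast_spread"): (2, -2),
    ("isolate_first", "stealth_spread"): (0, 0),
}


def pure_nash_equilibria(payoffs, rows, cols):
    """Return all strategy pairs where neither player gains by deviating alone."""
    equilibria = []
    for r, c in product(rows, cols):
        u_def, u_att = payoffs[(r, c)]
        # Defender cannot improve by switching rows while the attacker stays put.
        defender_ok = all(payoffs[(r2, c)][0] <= u_def for r2 in rows)
        # Attacker cannot improve by switching columns while the defender stays put.
        attacker_ok = all(payoffs[(r, c2)][1] <= u_att for c2 in cols)
        if defender_ok and attacker_ok:
            equilibria.append((r, c))
    return equilibria


if __name__ == "__main__":
    print(pure_nash_equilibria(PAYOFFS, DEFENDER_STRATEGIES, ATTACKER_STRATEGIES))
```

With these made-up utilities the only pure equilibrium is the pair ("patch_first", "stealth_spread"), where the defender's utility is positive. A real strategic-level game may have no pure equilibrium, in which case mixed strategies would have to be considered.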
