Quickest Detection of Advanced Persistent Threats: A Semi-Markov Game Approach

Submitted by grigby1 on Fri, 01/22/2021 - 1:38pm

Title	Quickest Detection of Advanced Persistent Threats: A Semi-Markov Game Approach
Publication Type	Conference Paper
Year of Publication	2020
Authors	Sahabandu, D., Allen, J., Moothedath, S., Bushnell, L., Lee, W., Poovendran, R.
Conference Name	2020 ACM/IEEE 11th International Conference on Cyber-Physical Systems (ICCPS)
Keywords	advanced persistent threat, advanced persistent threats, Analytical models, APT, Computational modeling, Cyber Attacks, cyberattack, DIFT-based defender, Dynamic Information Flow Tracking, game theory, Games, Human Behavior, Limiting average reward criteria, Markov processes, Metrics, multistage attacks, pubcrawl, q-learning, Rain, real-word attack dataset, refinable attack investigation framework, resilience, Resiliency, Scalability, security analysis, security of data, Stackelberg semiMarkov game, Stackelberg zero-sum semi-Markov games, threat detection, two-time scale Q-learning algorithm
Abstract	Advanced Persistent Threats (APTs) are stealthy, sophisticated, long-term, multi-stage attacks that threaten the security of sensitive information. Dynamic Information Flow Tracking (DIFT) has been proposed as a promising mechanism to detect and prevent various cyber attacks in computer systems. DIFT tracks suspicious information flows in the system and triggers security analysis when anomalous behavior is detected. The number of information flows in a system is typically large, and the amount of resources (such as memory, processing power, and storage) required to analyze different flows at different system locations varies. Hence, efficient use of resources is essential to maintain an acceptable level of system performance when using DIFT. On the other hand, the quickest detection of APTs is crucial because APTs are persistent and the damage to the system grows the longer the attacker remains in it. We address the problem of detecting APTs and model the trade-off between resource efficiency and quickest detection of APTs. We propose a game model that captures the interaction between an APT and a DIFT-based defender as a two-player, multi-stage, zero-sum, Stackelberg semi-Markov game. Our game considers performance parameters such as the false negatives generated by DIFT and the time required to execute various operations in the system. We propose a two-time scale Q-learning algorithm that converges to a Stackelberg equilibrium under an infinite-horizon, limiting average payoff criterion. We validate our model and algorithm on a real-world attack dataset obtained using the Refinable Attack INvestigation (RAIN) framework.
DOI	10.1109/ICCPS48487.2020.00009
Citation Key	sahabandu_quickest_2020
