Defense Against Advanced Persistent Threats: Optimal Network Security Hardening Using Multi-stage Maze Network Game

Title: Defense Against Advanced Persistent Threats: Optimal Network Security Hardening Using Multi-stage Maze Network Game
Publication Type: Conference Paper
Year of Publication: 2020
Authors: Zhang, H., Liu, H., Liang, J., Li, T., Geng, L., Liu, Y., Chen, S.
Conference Name: 2020 IEEE Symposium on Computers and Communications (ISCC)
Keywords: advanced persistent threat, advanced persistent threats, APT, Attack Graphs, continuous method, game model, game theoretic security, game theory, game-theoretic framework, Human Behavior, human factors, learning (artificial intelligence), Markov processes, Metrics, Multistage Maze Network Game, multistage plan, Nash equilibrium, network attacks, optimal network security, policy hill-climbing, policy hill-climbing algorithm, Predictive Metrics, pubcrawl, reinforcement learning (RL), resilience, Resiliency, Scalability, security of data, serious privacy leakage, Stackelberg games, stealthy method
Abstract:

An Advanced Persistent Threat (APT) is a stealthy, continuous, and sophisticated method of network attack that can cause serious privacy leakage and losses in the millions of dollars. In this paper, we introduce a new game-theoretic framework for the interaction between a defender, who uses limited Security Resources (SRs) to harden the network, and an attacker, who adopts a multi-stage plan to attack it. The game model, called the Multi-stage Maze Network Game (M2NG), is derived from Stackelberg games and fully accounts for the characteristics of APTs. The attacker's possible plans are compactly represented using attack graphs (AGs), but this compact representation of the attacker's strategies poses a computational challenge: reaching the Nash Equilibrium (NE) is NP-hard. We present a method that first translates the AGs into a Markov Decision Process (MDP) and then derives the optimal SR allocation using the policy hill-climbing (PHC) algorithm. Finally, we present an empirical evaluation of the model and analyze the scalability and sensitivity of the algorithm. Simulation results show that our proposed reinforcement learning-based SR allocation is feasible and efficient.
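For readers unfamiliar with policy hill-climbing, the sketch below shows the core PHC update (a standard Q-learning backup plus a small step of the mixed policy toward the greedy action) on a generic tabular MDP. This is a minimal illustration of the general algorithm, not the authors' M2NG formulation: the class name PHCAgent, the state/action encoding, the reward, and the hyperparameters alpha, gamma, and delta are all illustrative assumptions.

import numpy as np

class PHCAgent:
    """Tabular policy hill-climbing (PHC) on a generic MDP (illustrative sketch)."""
    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9, delta=0.05):
        self.n_actions = n_actions
        self.alpha, self.gamma, self.delta = alpha, gamma, delta
        self.Q = np.zeros((n_states, n_actions))                    # action-value estimates
        self.pi = np.full((n_states, n_actions), 1.0 / n_actions)   # mixed policy per state

    def act(self, s, rng):
        # Sample an action from the current mixed strategy for state s.
        return rng.choice(self.n_actions, p=self.pi[s])

    def update(self, s, a, r, s_next):
        # Q-learning backup toward the one-step bootstrapped target.
        td_target = r + self.gamma * self.Q[s_next].max()
        self.Q[s, a] += self.alpha * (td_target - self.Q[s, a])
        # Hill-climb the policy: step the greedy action up by delta,
        # the others down by delta / (|A| - 1).
        greedy = self.Q[s].argmax()
        for b in range(self.n_actions):
            step = self.delta if b == greedy else -self.delta / (self.n_actions - 1)
            self.pi[s, b] += step
        # Project back onto the probability simplex (clip negatives, renormalize).
        self.pi[s] = np.clip(self.pi[s], 0.0, None)
        self.pi[s] /= self.pi[s].sum()

# Example: one learning step on a hypothetical 2-state, 3-action defender MDP,
# where a negative reward stands in for the cost of an attacker advance.
rng = np.random.default_rng(0)
agent = PHCAgent(n_states=2, n_actions=3)
a = agent.act(0, rng)
agent.update(s=0, a=a, r=-1.0, s_next=1)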

DOI: 10.1109/ISCC50000.2020.9219722
Citation Key: zhang_defense_2020