Defense Against Advanced Persistent Threats: Optimal Network Security Hardening Using Multi-stage Maze Network Game
Title | Defense Against Advanced Persistent Threats: Optimal Network Security Hardening Using Multi-stage Maze Network Game |
Publication Type | Conference Paper |
Year of Publication | 2020 |
Authors | Zhang, H., Liu, H., Liang, J., Li, T., Geng, L., Liu, Y., Chen, S. |
Conference Name | 2020 IEEE Symposium on Computers and Communications (ISCC) |
Keywords | advanced persistent threat, advanced persistent threats, APT, Attack Graphs, continuous method, game model, game theoretic security, game theory, game-theoretic framework, Human Behavior, human factors, learning (artificial intelligence), Markov processes, Metrics, Multistage Maze Network Game, multistage plan, Nash equilibrium, network attacks, optimal network security, policy hill-climbing, policy hill-climbing algorithm, Predictive Metrics, pubcrawl, reinforcement learning (RL), resilience, Resiliency, Scalability, security of data, serious privacy leakage, Stackelberg games, stealthy method |
Abstract | Advanced Persistent Threat (APT) is a stealthy, continuous, and sophisticated method of network attack that can cause serious privacy leakage and losses of millions of dollars. In this paper, we introduce a new game-theoretic framework of the interaction between a defender who uses limited Security Resources (SRs) to harden the network and an attacker who adopts a multi-stage plan to attack the network. The game model, called a Multi-stage Maze Network Game (M2NG), is derived from Stackelberg games and fully considers the characteristics of APTs. The possible plans of the attacker are compactly represented using attack graphs (AGs), but this compact representation of the attacker's strategies presents a computational challenge, and reaching the Nash Equilibrium (NE) is NP-hard. We present a method that first translates the AGs into a Markov Decision Process (MDP) and then achieves the optimal SR allocation using the policy hill-climbing (PHC) algorithm. Finally, we present an empirical evaluation of the model and analyze the scalability and sensitivity of the algorithm. Simulation results show that our proposed reinforcement learning-based SR allocation is feasible and efficient. |
DOI | 10.1109/ISCC50000.2020.9219722 |
Citation Key | zhang_defense_2020 |
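The abstract credits the optimal SR allocation to the policy hill-climbing (PHC) algorithm applied to an MDP derived from attack graphs. As a rough illustration of PHC itself (the general algorithm, not the paper's M2NG model or its attack-graph MDP), here is a minimal sketch on a hypothetical toy MDP; the environment, state/action sets, and all hyperparameters are illustrative assumptions, not taken from the paper.

```python
import random
from collections import defaultdict

def phc(env_step, states, actions, episodes=2000,
        alpha=0.1, gamma=0.9, delta=0.05, eps=0.1):
    """Policy hill-climbing: Q-learning plus a small hill-climbing
    step of the mixed policy toward the current greedy action."""
    Q = defaultdict(float)  # Q[(state, action)]
    pi = {s: {a: 1.0 / len(actions) for a in actions} for s in states}
    for _ in range(episodes):
        s, done = states[0], False
        while not done:
            # epsilon-greedy sampling from the current mixed policy
            if random.random() < eps:
                a = random.choice(actions)
            else:
                a = random.choices(actions,
                                   weights=[pi[s][x] for x in actions])[0]
            s2, r, done = env_step(s, a)
            # standard Q-learning update
            best_next = max(Q[(s2, x)] for x in actions)
            Q[(s, a)] += alpha * (r + gamma * (0.0 if done else best_next)
                                  - Q[(s, a)])
            # hill-climb: shift probability mass toward the greedy action
            greedy = max(actions, key=lambda x: Q[(s, x)])
            for x in actions:
                step = delta if x == greedy else -delta / (len(actions) - 1)
                pi[s][x] = min(1.0, max(0.0, pi[s][x] + step))
            z = sum(pi[s].values())  # renormalize after clipping
            for x in actions:
                pi[s][x] /= z
            s = s2
    return Q, pi

# hypothetical 3-state chain MDP: action 1 moves right, action 0 stays;
# the episode ends with reward 1 on reaching state 2
def step(s, a):
    s2 = min(s + a, 2)
    return s2, (1.0 if s2 == 2 else 0.0), s2 == 2

random.seed(0)
Q, pi = phc(step, states=[0, 1, 2], actions=[0, 1])
```

After training, the mixed policy concentrates on action 1 in states 0 and 1, i.e. the learner has climbed toward the deterministic optimal policy; the `delta` step size controls how quickly the mixed policy commits to the greedy action.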