Title | Policy Network Assisted Monte Carlo Tree Search for Intelligent Service Function Chain Deployment |
Publication Type | Conference Paper |
Year of Publication | 2021 |
Authors | Fu, Zhihan; Fan, Qilin; Zhang, Xu; Li, Xiuhua; Wang, Sen; Wang, Yueyang |
Conference Name | 2021 IEEE 20th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom) |
Keywords | composability, Human Behavior, Metrics, Monte Carlo methods, Monte Carlo Tree Search, Network security, privacy, pubcrawl, reinforcement learning, resilience, Resiliency, search problems, service function chain, service function chaining, Software, virtual network function, virtualization privacy |
Abstract | Network function virtualization (NFV) simplifies the configuration and management of security services by migrating the network security functions from dedicated hardware devices to software middle-boxes that run on commodity servers. Under the paradigm of NFV, the service function chain (SFC) consisting of a series of ordered virtual network security functions is becoming a mainstream form to carry network security services. Allocating the underlying physical network resources to the demands of SFCs under given constraints over time is known as the SFC deployment problem. It is a crucial issue for infrastructure providers. However, SFC deployment is facing new challenges in trading off between pursuing the objective of a high revenue-to-cost ratio and making decisions in an online manner. In this paper, we investigate the use of reinforcement learning to guide online deployment decisions for SFC requests and propose a Policy network Assisted Monte Carlo Tree search approach named PACT to address the above challenge, aiming to maximize the average revenue-to-cost ratio. PACT combines the strengths of the policy network, which evaluates the placement potential of physical servers, and the Monte Carlo Tree Search, which is able to tackle problems with large state spaces. Extensive experimental results demonstrate that our PACT achieves the best performance and is superior to other algorithms by up to 30% and 23.8% on average revenue-to-cost ratio and acceptance rate, respectively. |
DOI | 10.1109/TrustCom53373.2021.00157 |
Citation Key | fu_policy_2021 |