Deep-Reinforcement Learning Multiple Access for Heterogeneous Wireless Networks
Title | Deep-Reinforcement Learning Multiple Access for Heterogeneous Wireless Networks |
Publication Type | Conference Paper |
Year of Publication | 2018 |
Authors | Yu, Yiding, Wang, Taotao, Liew, Soung Chang |
Conference Name | 2018 IEEE International Conference on Communications (ICC) |
Date Published | May 2018 |
Publisher | IEEE |
ISBN Number | 978-1-5386-3180-5 |
Keywords | clean slate, clean-slate design, Collaboration, convergence, DARPA SC2, deep reinforcement learning, deep-reinforcement learning multiple access, design framework, DLMA, DRL agent, DRL algorithmic framework, heterogeneous wireless networks, Human Behavior, human factors, learning (artificial intelligence), MAC design, machine learning, machine-learning technique, Media Access Protocol, Metrics, network layers, neural nets, Neural networks, optimal MAC strategy, policy, Policy Based Governance, policy governance, prior knowledge, pubcrawl, radio networks, resilience, Resiliency, Robustness, share spectrum, telecommunication computing, time division multiple access, time-slotted networks, traditional reinforcement learning, universal MAC protocol, wireless networks |
Abstract | This paper investigates the use of deep reinforcement learning (DRL) in the design of a "universal" MAC protocol referred to as Deep-reinforcement Learning Multiple Access (DLMA). The design framework is partially inspired by the vision of DARPA SC2, a 3-year competition in which competitors are to come up with a clean-slate design that "best share spectrum with any network(s), in any environment, without prior knowledge, leveraging on machine-learning technique". While the scope of DARPA SC2 is broad and involves the redesign of the PHY, MAC, and network layers, this paper's focus is narrower and involves only the MAC design. In particular, we consider the problem of sharing time slots among multiple time-slotted networks that adopt different MAC protocols. One of the MAC protocols is DLMA; the other two are TDMA and ALOHA. A DLMA node's DRL agent does not know that the other two MAC protocols are TDMA and ALOHA. Yet, through a series of observations of the environment, its own actions, and the rewards received, in accordance with the DRL algorithmic framework, the agent can learn the optimal MAC strategy for harmonious coexistence with TDMA and ALOHA nodes. In particular, the use of neural networks in DRL (as opposed to traditional reinforcement learning) allows for fast convergence to optimal solutions and robustness against perturbations in hyperparameter settings, two essential properties for practical deployment of DLMA in real wireless networks. |
URL | https://ieeexplore.ieee.org/document/8422168 |
DOI | 10.1109/ICC.2018.8422168 |
Citation Key | yu_deep-reinforcement_2018 |
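The abstract describes a DQN-style agent that shares time slots with TDMA and ALOHA nodes, learning only from observations of its own actions and per-slot channel feedback. Below is a minimal, hypothetical sketch of that setup in PyTorch, not the authors' implementation: the history-window length, network sizes, TDMA slot assignment, ALOHA transmit probability, feedback encoding, and sum-throughput reward convention are all illustrative assumptions rather than values from the paper.

```python
# Minimal sketch (assumptions labeled) of a DQN agent sharing time slots
# with a TDMA node and an ALOHA node, as outlined in the abstract above.
import random
from collections import deque

import torch
import torch.nn as nn

HIST = 20            # assumed length of the (action, feedback) history window
GAMMA = 0.9          # discount factor (assumed)
TDMA_SLOTS = {0, 1}  # assumed: TDMA node owns slots 0-1 of every 5-slot frame
ALOHA_P = 0.3        # assumed ALOHA transmit probability

def others_transmit(t):
    """Number of non-agent nodes transmitting in slot t."""
    n = 1 if (t % 5) in TDMA_SLOTS else 0
    n += 1 if random.random() < ALOHA_P else 0
    return n

# Q-network mapping the history window to Q-values for {wait, transmit}.
qnet = nn.Sequential(nn.Linear(2 * HIST, 64), nn.ReLU(),
                     nn.Linear(64, 64), nn.ReLU(),
                     nn.Linear(64, 2))
opt = torch.optim.Adam(qnet.parameters(), lr=1e-3)
replay = deque(maxlen=5000)
state = deque([0.0] * 2 * HIST, maxlen=2 * HIST)  # flattened (action, feedback) pairs

for t in range(20000):
    s = torch.tensor(list(state))
    # Epsilon-greedy action selection: 1 = transmit, 0 = wait.
    if random.random() < max(0.01, 1.0 - t / 5000):
        a = random.randrange(2)
    else:
        with torch.no_grad():
            a = int(qnet(s).argmax())
    busy = others_transmit(t)
    # Assumed sum-throughput reward: 1 if the slot carries exactly one
    # transmission (by any node), 0 for an idle slot or a collision.
    r = 1.0 if (busy + a) == 1 else 0.0
    # Crude feedback code: 0 = idle, 0.5 = success, 1 = collision.
    state.extend([float(a), float(min(busy + a, 2)) / 2])
    s2 = torch.tensor(list(state))
    replay.append((s, a, r, s2))
    if len(replay) >= 64:  # one DQN update per slot from experience replay
        batch = random.sample(replay, 64)
        ss = torch.stack([b[0] for b in batch])
        aa = torch.tensor([b[1] for b in batch])
        rr = torch.tensor([b[2] for b in batch])
        s2s = torch.stack([b[3] for b in batch])
        with torch.no_grad():
            target = rr + GAMMA * qnet(s2s).max(dim=1).values
        pred = qnet(ss).gather(1, aa.unsqueeze(1)).squeeze(1)
        loss = nn.functional.mse_loss(pred, target)
        opt.zero_grad(); loss.backward(); opt.step()
```

The experience-replay buffer and the neural Q-function stand in for the two properties the abstract emphasizes: generalization across unseen (action, feedback) histories, and convergence behavior that a tabular Q-learner over the same state space would struggle to match.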