LG IT MA SP ML
Feb 14, 2020

Resource Management in Wireless Networks via Multi-Agent Deep Reinforcement Learning

Navid Naderializadeh, Jaroslaw Sydir, Meryem Simsek, Hosein Nikopour

arXiv:2002.06215v2 17.6182 citations

Originality: Incremental advance

AI Analysis

This work addresses resource allocation challenges faced by wireless network operators, but the advance is incremental because it builds on existing multi-agent RL methods.

The paper tackles distributed resource management and interference mitigation in wireless networks by using multi-agent deep reinforcement learning, where each transmitter acts as an agent making decisions based on delayed and exchanged observations. Simulation results show the approach outperforms decentralized baselines in tradeoffs between average and 5th percentile user rates, achieving performance close to or better than a centralized baseline in some cases.

We propose a mechanism for distributed resource management and interference mitigation in wireless networks using multi-agent deep reinforcement learning (RL). We equip each transmitter in the network with a deep RL agent that receives delayed observations from its associated users, while also exchanging observations with its neighboring agents, and decides on which user to serve and what transmit power to use at each scheduling interval. Our proposed framework enables agents to make decisions simultaneously and in a distributed manner, unaware of the concurrent decisions of other agents. Moreover, our design of the agents' observation and action spaces is scalable, in the sense that an agent trained on a scenario with a specific number of transmitters and users can be applied to scenarios with different numbers of transmitters and/or users. Simulation results demonstrate the superiority of our proposed approach compared to decentralized baselines in terms of the tradeoff between average and $5^{th}$ percentile user rates, while achieving performance close to, and even in certain cases outperforming, that of a centralized information-theoretic baseline. We also show that our trained agents are robust and maintain their performance gains when experiencing mismatches between train and test deployments.
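The evaluation above is framed as a tradeoff between the average and the $5^{th}$ percentile user rate. As a minimal sketch of how those two figures could be computed from a list of per-user rates, the snippet below uses only the Python standard library. The function names (`percentile`, `rate_tradeoff`), the linear-interpolation percentile rule, and the example rates are assumptions made for illustration; they are not taken from the paper and do not reproduce its results.

```python
def percentile(values, q):
    """Return the q-th percentile (0 <= q <= 100) of values.

    Uses linear interpolation between the two nearest order statistics.
    This interpolation rule is an assumption; the paper does not state one.
    """
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def rate_tradeoff(user_rates):
    """Return (average rate, 5th percentile rate) for a list of user rates."""
    avg = sum(user_rates) / len(user_rates)
    return avg, percentile(user_rates, 5)


# Hypothetical per-user rates in bits/s/Hz, invented purely for illustration.
example_rates = [0.2, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
avg_rate, p5_rate = rate_tradeoff(example_rates)
print(f"average rate: {avg_rate:.3f}, 5th percentile rate: {p5_rate:.3f}")
```

The average rewards overall throughput, while the 5th percentile tracks the worst-served users, so a scheme that raises one at the expense of the other moves along the tradeoff rather than improving it.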
