SP AI LG MA NI
Dec 22, 2020

Scalable Deep Reinforcement Learning for Routing and Spectrum Access in Physical Layer

arXiv:2012.11783v24.319 citations

Originality: Incremental advance

AI Analysis

This work aims to improve wireless ad-hoc network performance for network operators by jointly optimizing routing and spectrum access, which is an incremental improvement over existing methods.

This paper addresses simultaneous routing and spectrum access in wireless ad-hoc networks, considering physical-layer signal-to-interference-plus-noise ratio (SINR). It proposes a scalable reinforcement learning approach where a single agent per flow optimizes for bottleneck SINR, learning to avoid interference through joint routing and spectrum allocation.

This paper proposes a novel scalable reinforcement learning approach for simultaneous routing and spectrum access in wireless ad-hoc networks. In most previous works on reinforcement learning for network optimization, the network topology is assumed to be fixed, and a different agent is trained for each transmission node -- this limits scalability and generalizability. Further, routing and spectrum access are typically treated as separate tasks. Moreover, the optimization objective is usually a cumulative metric along the route, e.g., number of hops or delay. In this paper, we account for the physical-layer signal-to-interference-plus-noise ratio (SINR) in a wireless network and further show that a bottleneck objective, such as the minimum SINR along the route, can also be optimized effectively using reinforcement learning. Specifically, we propose a scalable approach in which a single agent is associated with each flow and makes routing and spectrum access decisions as it moves along the frontier nodes. The agent is trained according to the physical-layer characteristics of the environment using a novel rewarding scheme based on the Monte Carlo estimation of the future bottleneck SINR. It learns to avoid interference by intelligently making joint routing and spectrum allocation decisions based on the geographical location information of the neighbouring nodes.
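The bottleneck objective in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the function names `sinr` and `bottleneck_targets`, the linear-scale power values, and the choice to use the suffix minimum of per-hop SINRs as the Monte Carlo target are all assumptions made here for illustration. The idea shown is that, once a route is completed, the target at each hop is the minimum SINR over the remaining hops rather than a sum of per-hop rewards.

```python
from typing import List, Sequence


def sinr(signal_power: float,
         interference_powers: Sequence[float],
         noise_power: float) -> float:
    """Linear-scale SINR of one link: S / (sum of interference + noise)."""
    return signal_power / (sum(interference_powers) + noise_power)


def bottleneck_targets(link_sinrs: Sequence[float]) -> List[float]:
    """For each hop t, return min(link_sinrs[t:]).

    This is the bottleneck SINR of the route from hop t onward, computed
    from one completed rollout (hypothetical Monte Carlo target).
    """
    targets = [0.0] * len(link_sinrs)
    running_min = float("inf")
    # Walk the route backwards so each hop sees the minimum of all later hops.
    for t in range(len(link_sinrs) - 1, -1, -1):
        running_min = min(running_min, link_sinrs[t])
        targets[t] = running_min
    return targets


# Made-up per-hop SINR values for a four-hop route.
route_sinrs = [8.0, 3.0, 5.0, 4.0]
print(bottleneck_targets(route_sinrs))  # [3.0, 3.0, 4.0, 4.0]
```

Unlike a cumulative metric such as hop count or delay, the target here does not decompose into a sum over hops, which is why the sketch uses a running minimum over a full rollout instead of a discounted return.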
