NI, LG, ML · Sep 27, 2020

Scheduling and Power Control for Wireless Multicast Systems via Deep Reinforcement Learning

Ramkumar Raghu, Mahadesh Panju, Vaneet Aggarwal, Vinod Sharma

arXiv:2011.14799v12.31 citations

Originality: Incremental advance

AI Analysis

This work addresses the challenge of improving performance in wireless multicast networks for content delivery. It offers a scalable solution that adapts to time-varying conditions, though the advance is incremental: it applies existing deep reinforcement learning techniques to this specific domain.

The paper tackles power control and scheduling in wireless multicast systems under fading. It uses deep reinforcement learning to scale to larger systems and adapt to changing dynamics, and reports that the learned policy matches the optimal policy for small networks while maintaining the average power constraint.

Multicasting in wireless systems is a natural way to exploit the redundancy in user requests in a Content Centric Network. Power control and optimal scheduling can significantly improve the performance of a wireless multicast network under fading. However, the model-based approaches to power control and scheduling studied earlier do not scale to large state spaces or to changing system dynamics. In this paper, we use deep reinforcement learning, approximating the Q-function with a deep neural network, to obtain a power control policy that matches the optimal policy for a small network. We show that a power control policy can be learnt for reasonably large systems via this approach. Further, we use multi-timescale stochastic optimization to maintain the average power constraint. We demonstrate that a slight modification of the learning algorithm allows tracking of time-varying system statistics. Finally, we extend the multi-timescale approach to simultaneously learn the optimal queueing strategy along with power control. We demonstrate the scalability, tracking, and cross-layer optimization capabilities of our algorithms via simulations. The proposed multi-timescale approach can be used in general large-state-space dynamical systems with multiple objectives and constraints, and may be of independent interest.
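The multi-timescale idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm or system model. It swaps the deep Q-network for a tabular Q-function, and the two-state fading channel, the power levels, the log-rate reward, and the step-size schedules are all assumptions chosen for illustration. Q-values are updated on a fast timescale, while a Lagrange multiplier that penalizes power use is updated on a slower timescale to push the average power toward a budget `p_bar`.

```python
import math
import random


def train(p_bar=1.0, steps=20000, seed=0):
    """Toy two-timescale constrained Q-learning (illustrative assumptions only)."""
    rng = random.Random(seed)
    gains = [0.2, 2.0]        # hypothetical channel gains: bad, good
    powers = [0.0, 1.0, 2.0]  # hypothetical transmit power levels
    gamma, eps = 0.9, 0.1     # discount factor and exploration rate
    q = [[0.0] * len(powers) for _ in gains]
    lam = 0.0                 # Lagrange multiplier for the power constraint
    total_power = 0.0
    state = rng.randrange(len(gains))

    for n in range(1, steps + 1):
        # Epsilon-greedy action selection over power levels.
        if rng.random() < eps:
            a = rng.randrange(len(powers))
        else:
            a = max(range(len(powers)), key=lambda i: q[state][i])
        p = powers[a]

        # Lagrangian reward: rate minus the price of power.
        reward = math.log(1.0 + p * gains[state]) - lam * p
        nxt = rng.randrange(len(gains))  # i.i.d. fading (assumption)

        fast = 1.0 / n ** 0.6  # fast step size for the Q update
        slow = 1.0 / n         # slow step size; slow / fast -> 0

        # Fast timescale: standard Q-learning update.
        target = reward + gamma * max(q[nxt])
        q[state][a] += fast * (target - q[state][a])

        # Slow timescale: raise lam when power exceeds the budget.
        lam = max(0.0, lam + slow * (p - p_bar))

        total_power += p
        state = nxt

    return q, lam, total_power / steps


if __name__ == "__main__":
    q, lam, avg_power = train()
    print("lambda:", lam, "average power:", avg_power)
```

The separation of step sizes is the key design choice: because the multiplier moves much more slowly than the Q-values, the Q-learner sees an almost fixed price of power at each step, which is the usual argument for why such coupled updates can be analyzed as two nested problems.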
