NI AI LG SPNov 15, 2021

Learning Robust Scheduling with Search and Attention

David Sandberg, Tor Kvernvik, Francesco Davide Calabrese

arXiv:2111.08073v12.35 citations

Originality Incremental advance

AI Analysis

This addresses the challenge of efficiently allocating radio resources in MU-MIMO systems, which is crucial for wireless network management, but it is an incremental improvement as it adapts existing methods like AlphaGo Zero to a specific domain.

The paper tackled the MU-MIMO scheduling problem by treating it as a tree-structured combinatorial optimization and using a combination of Monte Carlo Tree Search and Reinforcement Learning with a self-attention neural network, demonstrating that it vastly outperforms state-of-the-art heuristic-based approaches in the presence of measurement uncertainties and finite buffers.

Allocating physical layer resources to users based on channel quality, buffer size, requirements and constraints represents one of the central optimization problems in the management of radio resources. The solution space grows combinatorially with the cardinality of each dimension making it hard to find optimal solutions using an exhaustive search or even classical optimization algorithms given the stringent time requirements. This problem is even more pronounced in MU-MIMO scheduling where the scheduler can assign multiple users to the same time-frequency physical resources. Traditional approaches thus resort to designing heuristics that trade optimality in favor of feasibility of execution. In this work we treat the MU-MIMO scheduling problem as a tree-structured combinatorial problem and, borrowing from the recent successes of AlphaGo Zero, we investigate the feasibility of searching for the best performing solutions using a combination of Monte Carlo Tree Search and Reinforcement Learning. To cater to the nature of the problem at hand, like the lack of an intrinsic ordering of the users as well as the importance of dependencies between combinations of users, we make fundamental modifications to the neural network architecture by introducing the self-attention mechanism. We then demonstrate that the resulting approach is not only feasible but vastly outperforms state-of-the-art heuristic-based scheduling approaches in the presence of measurement uncertainties and finite buffers.

View on arXiv PDF

Similar