LG · May 1, 2021

Better than the Best: Gradient-based Improper Reinforcement Learning for Network Scheduling

arXiv:2105.00210v1 · 1 citation
Originality: Incremental advance
AI Analysis

This paper addresses the need for rapid network deployment in modern communication systems with a top-down scheduling approach. The advance is incremental, since it builds on existing policy gradient methods.

The paper tackles packet delay minimization in complex, unknown constrained queueing networks. It develops a policy gradient reinforcement learning algorithm that combines a given set of atomic scheduling policies into a scheduler that outperforms them. In simulations, the learned scheduler performs well under nonstationary arrival rates and can stabilize the system even when the constituent policies are unstable.

We consider the problem of scheduling in constrained queueing networks with a view to minimizing packet delay. Modern communication systems are becoming increasingly complex and are required to handle multiple types of traffic with widely varying characteristics, such as arrival rates and service times. This, coupled with the need for rapid network deployment, renders infeasible a bottom-up approach of first characterizing the traffic and then devising an appropriate scheduling protocol. In contrast, we formulate a top-down approach to scheduling where, given an unknown network and a set of scheduling policies, we use a policy-gradient-based reinforcement learning algorithm to produce a scheduler that performs better than the available atomic policies. We derive convergence results and analyze the finite-time performance of the algorithm. Simulation results show that the algorithm performs well even when the arrival rates are nonstationary and can stabilize the system even when the constituent policies are unstable.
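The idea in the abstract can be illustrated with a toy example. The following is a minimal sketch under stated assumptions, not the paper's algorithm: two queues share one server, arrivals are Bernoulli with assumed rates of 0.4 each, and the two atomic policies ("always serve queue 0", "always serve queue 1") are each unstable on their own. A softmax mixture over the atomic policies is tuned with an episodic REINFORCE-style update. All names (`ATOMIC_POLICIES`, `run_episode`, `train`, `average_cost`) and all parameter values are hypothetical.

```python
import numpy as np

ARRIVAL_RATES = np.array([0.4, 0.4])  # assumed Bernoulli arrival probabilities
# Atomic policies: each maps the queue vector to the index of the queue to serve.
# Each one alone starves the other queue, so each is unstable by itself.
ATOMIC_POLICIES = [lambda q: 0, lambda q: 1]


def softmax(theta):
    z = np.exp(theta - theta.max())
    return z / z.sum()


def run_episode(probs, horizon, rng):
    """Simulate one episode; return (mean total queue length, usage counts)."""
    k = len(ATOMIC_POLICIES)
    choices = rng.choice(k, size=horizon, p=probs)  # which atomic policy acts in each slot
    arrivals = rng.random((horizon, len(ARRIVAL_RATES))) < ARRIVAL_RATES
    queues = np.zeros(len(ARRIVAL_RATES), dtype=int)
    total = 0.0
    for t in range(horizon):
        served = ATOMIC_POLICIES[choices[t]](queues)
        if queues[served] > 0:
            queues[served] -= 1          # serve one packet from the chosen queue
        queues += arrivals[t]            # new packets join after service
        total += queues.sum()
    return total / horizon, np.bincount(choices, minlength=k)


def train(episodes=300, horizon=200, lr=0.3, seed=0):
    """Gradient descent on mean queue length over the softmax mixing weights."""
    rng = np.random.default_rng(seed)
    theta = np.array([2.0, 0.0])         # start heavily biased toward policy 0
    baseline = None
    for _ in range(episodes):
        probs = softmax(theta)
        cost, counts = run_episode(probs, horizon, rng)
        if baseline is None:
            baseline = cost
        # Score function of the episode: sum_t (e_{k_t} - probs), scaled by 1/horizon.
        score = (counts - horizon * probs) / horizon
        theta -= lr * (cost - baseline) * score
        baseline = 0.9 * baseline + 0.1 * cost  # running-average baseline
    return softmax(theta)


def average_cost(probs, runs=20, horizon=200, seed=1):
    rng = np.random.default_rng(seed)
    return float(np.mean([run_episode(probs, horizon, rng)[0] for _ in range(runs)]))


if __name__ == "__main__":
    learned = train()
    print("learned mixing weights:", learned)
    print("mixture cost:", average_cost(learned))
    for i in range(len(ATOMIC_POLICIES)):
        print(f"atomic policy {i} cost:", average_cost(np.eye(2)[i]))
```

The learner here only ever chooses among the given atomic policies, which is the "improper" aspect named in the title: the output is a randomized mixture rather than a member of the original policy set. The toy setup makes no claim about the convergence or finite-time guarantees derived in the paper.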

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
