Learning and Information in Stochastic Networks and Queues
This work addresses the integration of learning methods into queueing systems for improved decision-making, but it is largely a review and incremental synthesis of existing ideas.
The paper reviews how information and learning techniques from machine learning affect the stability and optimization of queueing systems. It connects queueing theory with adversarial learning by showing that the MaxWeight and BackPressure policies are applications of Blackwell's Approachability Theorem, and it bounds queue size regret when a perceptron algorithm is used to classify service.
We review the role of information and learning in the stability and optimization of queueing systems. In recent years, techniques from supervised learning, bandit learning, and reinforcement learning have been applied to queueing systems, supported by the increasing role of information in decision making. We present observations and new results that help rationalize the application of these areas to queueing systems. We prove that the MaxWeight and BackPressure policies are applications of Blackwell's Approachability Theorem. This connects queueing-theoretic results with adversarial learning. We then discuss the requirements of statistical learning for service parameter estimation. As an example, we show how queue size regret can be bounded when a perceptron algorithm is applied to classify service. Next, we discuss the role of state information in improved decision making. Here we contrast the roles of epistemic information (information on uncertain parameters) and aleatoric information (information on an uncertain state). Finally, we review recent advances in the theory of reinforcement learning and queueing, and discuss current research challenges.
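
For readers unfamiliar with MaxWeight, the following is a minimal sketch of the decision rule in its simplest setting, not the paper's own formulation. It assumes a single server that serves one queue per time slot and that the service rates are known; the function name `maxweight_choice` and its arguments are hypothetical and chosen for illustration only. The rule selects the queue with the largest product of queue length and service rate.

```python
from typing import Sequence


def maxweight_choice(queue_lengths: Sequence[float],
                     service_rates: Sequence[float]) -> int:
    """Return the index of the queue with the largest weight Q_i * mu_i.

    Assumption (not from the paper): one server, one queue served per
    slot, and known service rates mu_i.
    """
    if len(queue_lengths) != len(service_rates):
        raise ValueError("queue_lengths and service_rates must have the same length")
    if len(queue_lengths) == 0:
        raise ValueError("at least one queue is required")

    # Weight of each queue: current backlog times its service rate.
    weights = [q * mu for q, mu in zip(queue_lengths, service_rates)]

    # max() returns the first maximal index, so ties go to the lowest
    # index; this tie-breaking rule is an arbitrary choice.
    return max(range(len(weights)), key=weights.__getitem__)
```

For example, with queue lengths `[3, 5, 1]` and service rates `[1.0, 0.5, 2.0]` the weights are 3, 2.5, and 2, so the first queue is served even though the second queue is longer.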