LG · ML · Oct 5, 2021

NeurWIN: Neural Whittle Index Network For Restless Bandits Via Deep RL

Khaled Nakhleh, Santosh Ganji, Ping-Chun Hsieh, I-Hong Hou, Srinivas Shakkottai

arXiv:2110.02128v2 · 16.8 · 49 citations · Has Code

Originality: Highly original

AI Analysis

This work is aimed at researchers and practitioners in reinforcement learning and optimization. It addresses a challenging problem in restless bandits by proposing a neural-network method for learning Whittle indices.

The paper tackles the problem of finding Whittle indices for restless bandits, which is difficult due to convoluted transition kernels, by proposing NeurWIN, a neural network that learns these indices using deep reinforcement learning. The results show that NeurWIN significantly outperforms other RL algorithms in three recently studied restless bandit problems.

The Whittle index policy is a powerful tool for obtaining asymptotically optimal solutions to the notoriously intractable problem of restless bandits. However, finding the Whittle indices remains difficult for many practical restless bandits with convoluted transition kernels. This paper proposes NeurWIN, a neural Whittle index network that seeks to learn the Whittle indices for any restless bandit by leveraging mathematical properties of the Whittle indices. We show that a neural network that produces the Whittle index is also one that produces the optimal control for a set of Markov decision problems. This property motivates using deep reinforcement learning to train NeurWIN. We demonstrate the utility of NeurWIN by evaluating its performance on three recently studied restless bandit problems. Our experimental results show that NeurWIN performs significantly better than other RL algorithms.
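The connection between an index and a control, as described in the abstract, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `threshold_control`, `whittle_index_policy`, and `toy_index` are hypothetical, and the toy index function stands in for a trained network. The sketch assumes the standard Whittle index semantics: for a single arm facing an activation cost, the arm is activated when its index exceeds that cost, and across several arms the policy activates those with the largest indices.

```python
from typing import Callable, List, Sequence


def threshold_control(index_fn: Callable[[float], float],
                      state: float,
                      activation_cost: float) -> int:
    """Control for one arm in a single-arm problem with an activation cost.

    Returns 1 (activate) when the index of the current state exceeds the cost,
    otherwise 0 (stay passive).
    """
    return 1 if index_fn(state) > activation_cost else 0


def whittle_index_policy(indices: Sequence[float], budget: int) -> List[int]:
    """Activate the `budget` arms whose current indices are largest.

    Returns a 0/1 activation decision per arm, in the original arm order.
    """
    # Sort arm positions by index, largest first (ties keep their original order).
    order = sorted(range(len(indices)), key=lambda i: indices[i], reverse=True)
    chosen = set(order[:budget])
    return [1 if i in chosen else 0 for i in range(len(indices))]


# Hypothetical stand-in for a learned index function (an assumption for illustration).
def toy_index(state: float) -> float:
    return 2.0 * state


if __name__ == "__main__":
    # Single arm: index 2.0 exceeds cost 1.5, so the arm is activated.
    print(threshold_control(toy_index, 1.0, 1.5))
    # Four arms, at most two activations per step.
    states = [0.15, 0.6, -0.25, 0.45]
    print(whittle_index_policy([toy_index(s) for s in states], budget=2))
```

In this sketch the same index function serves both purposes: it defines a threshold control for every possible activation cost, and it ranks arms when several compete for a limited activation budget.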
