LGAICYSIDec 9, 2022

Networked Restless Bandits with Positive Externalities

arXiv:2212.05144v17 citationsh-index: 38
Originality Incremental advance
AI Analysis

This work addresses resource allocation tasks in communities with positive externalities, offering a novel extension to restless bandits that is domain-specific and incremental in nature.

The paper tackles the problem of resource allocation in restless multi-armed bandits by introducing networked restless bandits, where arms benefit from neighbors receiving resources, and presents Greta, a graph-aware algorithm that outperforms comparison policies across various hyperparameter values and graph topologies.

Restless multi-armed bandits are often used to model budget-constrained resource allocation tasks where receipt of the resource is associated with an increased probability of a favorable state transition. Prior work assumes that individual arms only benefit if they receive the resource directly. However, many allocation tasks occur within communities and can be characterized by positive externalities that allow arms to derive partial benefit when their neighbor(s) receive the resource. We thus introduce networked restless bandits, a novel multi-armed bandit setting in which arms are both restless and embedded within a directed graph. We then present Greta, a graph-aware, Whittle index-based heuristic algorithm that can be used to efficiently construct a constrained reward-maximizing action vector at each timestep. Our empirical results demonstrate that Greta outperforms comparison policies across a range of hyperparameter values and graph topologies.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes