ML LGSep 9, 2024

Bridging Rested and Restless Bandits with Graph-Triggering: Rising and Rotting

Gianmarco Genalti, Marco Mussi, Nicola Gatti, Marcello Restelli, Matteo Castiglioni, Alberto Maria Metelli

arXiv:2409.05980v13.1h-index: 38

Originality Incremental advance

AI Analysis

This work provides a unifying framework for sequential decision-making problems, but it is incremental as it extends existing bandit models.

The authors tackled the problem of generalizing rested and restless bandits by proposing Graph-Triggered Bandits (GTBs), a framework where arm rewards evolve based on a graph structure, with rising and rotting bandits as case studies, and they developed algorithms with theoretical guarantees.

Rested and Restless Bandits are two well-known bandit settings that are useful to model real-world sequential decision-making problems in which the expected reward of an arm evolves over time due to the actions we perform or due to the nature. In this work, we propose Graph-Triggered Bandits (GTBs), a unifying framework to generalize and extend rested and restless bandits. In this setting, the evolution of the arms' expected rewards is governed by a graph defined over the arms. An edge connecting a pair of arms $(i,j)$ represents the fact that a pull of arm $i$ triggers the evolution of arm $j$, and vice versa. Interestingly, rested and restless bandits are both special cases of our model for some suitable (degenerated) graph. As relevant case studies for this setting, we focus on two specific types of monotonic bandits: rising, where the expected reward of an arm grows as the number of triggers increases, and rotting, where the opposite behavior occurs. For these cases, we study the optimal policies. We provide suitable algorithms for all scenarios and discuss their theoretical guarantees, highlighting the complexity of the learning problem concerning instance-dependent terms that encode specific properties of the underlying graph structure.

View on arXiv PDF

Similar