LG AI MLFeb 14, 2012

Graphical Models for Bandit Problems

arXiv:1202.3782v120 citations

Originality Incremental advance

AI Analysis

This addresses the problem of computational efficiency in bandit algorithms for researchers and practitioners dealing with high-dimensional decision-making, though it appears incremental as it builds on existing graphical model frameworks.

The paper tackles the challenge of scaling multi-armed bandit problems to large state and action spaces by introducing graphical models that succinctly specify payoffs, resulting in an algorithm with regret bounded by the number of parameters and running time dependent on the treewidth of the action space graph.

We introduce a rich class of graphical models for multi-armed bandit problems that permit both the state or context space and the action space to be very large, yet succinctly specify the payoffs for any context-action pair. Our main result is an algorithm for such models whose regret is bounded by the number of parameters and whose running time depends only on the treewidth of the graph substructure induced by the action space.

View on arXiv PDF

Similar