LGITMLMay 21, 2025

Optimal Best-Arm Identification under Fixed Confidence with Multiple Optima

arXiv:2505.15643v1h-index: 1Oper Res Lett
Originality Incremental advance
AI Analysis

This work addresses a specific theoretical gap in bandit algorithms for researchers, but it is incremental as it builds on existing methods to handle multiple optimal arms.

The paper tackled the problem of best-arm identification in stochastic multi-armed bandits with multiple optimal arms under fixed confidence, proposing a modified stopping rule for the Track-and-Stop algorithm that ensures instance-optimality. The result demonstrates that this rule tightly matches a new information-theoretic lower bound accounting for multiple optima.

We study the problem of best-arm identification in stochastic multi-armed bandits under the fixed-confidence setting, with a particular focus on instances that admit multiple optimal arms. While the Track-and-Stop algorithm of Garivier and Kaufmann (2016) is widely conjectured to be instance-optimal, its performance in the presence of multiple optima has remained insufficiently understood. In this work, we revisit the Track-and-Stop strategy and propose a modified stopping rule that ensures instance-optimality even when the set of optimal arms is not a singleton. Our analysis introduces a new information-theoretic lower bound that explicitly accounts for multiple optimal arms, and we demonstrate that our stopping rule tightly matches this bound.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes