LGMay 27, 2016

Open Problem: Best Arm Identification: Almost Instance-Wise Optimality and the Gap Entropy Conjecture

arXiv:1605.08481v11 citations
Originality Synthesis-oriented
AI Analysis

This addresses a foundational open problem in bandit theory, with incremental progress toward resolving long-standing gaps in understanding optimality.

The paper tackles the best arm identification problem in stochastic multi-armed bandits, aiming to understand its optimal sample complexity, and conjectures that an instance-optimal algorithm exists modulo an additive term, introducing gap entropy as a potential lower bound.

The best arm identification problem (BEST-1-ARM) is the most basic pure exploration problem in stochastic multi-armed bandits. The problem has a long history and attracted significant attention for the last decade. However, we do not yet have a complete understanding of the optimal sample complexity of the problem: The state-of-the-art algorithms achieve a sample complexity of $O(\sum_{i=2}^{n} Δ_{i}^{-2}(\lnδ^{-1} + \ln\lnΔ_i^{-1}))$ ($Δ_{i}$ is the difference between the largest mean and the $i^{th}$ mean), while the best known lower bound is $Ω(\sum_{i=2}^{n} Δ_{i}^{-2}\lnδ^{-1})$ for general instances and $Ω(Δ^{-2} \ln\ln Δ^{-1})$ for the two-arm instances. We propose to study the instance-wise optimality for the BEST-1-ARM problem. Previous work has proved that it is impossible to have an instance optimal algorithm for the 2-arm problem. However, we conjecture that modulo the additive term $Ω(Δ_2^{-2} \ln\ln Δ_2^{-1})$ (which is an upper bound and worst case lower bound for the 2-arm problem), there is an instance optimal algorithm for BEST-1-ARM. Moreover, we introduce a new quantity, called the gap entropy for a best-arm problem instance, and conjecture that it is the instance-wise lower bound. Hence, resolving this conjecture would provide a final answer to the old and basic problem.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes