Breaking the $\log(1/Δ_2)$ Barrier: Better Batched Best Arm Identification with Adaptive Grids
This addresses the challenge of minimizing samples and batches for identifying the best arm in bandit problems, with incremental improvements in batch efficiency.
The paper tackles the problem of batched best arm identification in multi-armed bandits by introducing an algorithm that achieves near-optimal sample complexity and breaks the log(1/Δ₂) barrier for batch complexity, with experimental results showing improved batch efficiency across various setups.
We investigate the problem of batched best arm identification in multi-armed bandits, where we aim to identify the best arm from a set of $n$ arms while minimizing both the number of samples and batches. We introduce an algorithm that achieves near-optimal sample complexity and features an instance-sensitive batch complexity, which breaks the $\log(1/Δ_2)$ barrier. The main contribution of our algorithm is a novel sample allocation scheme that effectively balances exploration and exploitation for batch sizes. Experimental results indicate that our approach is more batch-efficient across various setups. We also extend this framework to the problem of batched best arm identification in linear bandits and achieve similar improvements.