Rate-optimal Design for Anytime Best Arm Identification
This work addresses a practical problem in scenarios like A/B testing by providing an anytime algorithm that does not require prior knowledge of the total budget, offering incremental improvements over existing methods.
The paper tackles the best arm identification problem under a limited sampling budget, proposing an algorithm called Almost Tracking that is provably minimax optimal and outperforms existing anytime and fixed-budget algorithms in experiments.
We consider the best arm identification problem, where the goal is to identify the arm with the highest mean reward from a set of $K$ arms under a limited sampling budget. This problem models many practical scenarios such as A/B testing. We consider a class of algorithms for this problem, which is provably minimax optimal up to a constant factor. This idea is a generalization of existing works in fixed-budget best arm identification, which are limited to a particular choice of risk measures. Based on the framework, we propose Almost Tracking, a closed-form algorithm that has a provable guarantee on the popular risk measure $H_1$. Unlike existing algorithms, Almost Tracking does not require the total budget in advance nor does it need to discard a significant part of samples, which gives a practical advantage. Through experiments on synthetic and real-world datasets, we show that our algorithm outperforms existing anytime algorithms as well as fixed-budget algorithms.