On the Existence of a Complexity in Fixed Budget Bandit Identification
This work addresses a foundational theoretical gap in bandit algorithms for researchers, showing that optimal performance is problem-dependent, which is incremental but clarifies limitations in algorithm design.
The paper tackles the problem of determining whether a universal optimal error rate exists for fixed budget bandit identification tasks, and finds that for several tasks, including Bernoulli best arm identification with two arms, no single algorithm can achieve the best possible rate everywhere.
In fixed budget bandit identification, an algorithm sequentially observes samples from several distributions up to a given final time. It then answers a query about the set of distributions. A good algorithm will have a small probability of error. While that probability decreases exponentially with the final time, the best attainable rate is not known precisely for most identification tasks. We show that if a fixed budget task admits a complexity, defined as a lower bound on the probability of error which is attained by the same algorithm on all bandit problems, then that complexity is determined by the best non-adaptive sampling procedure for that problem. We show that there is no such complexity for several fixed budget identification tasks including Bernoulli best arm identification with two arms: there is no single algorithm that attains everywhere the best possible rate.