Identifying All ε-Best Arms in (Misspecified) Linear Bandits
This work addresses the need to identify multiple promising candidates in high-cost tasks such as drug discovery, offering a near-optimal solution that incrementally extends prior bandit methods.
The paper tackles the problem of efficiently identifying all near-optimal arms in linear bandits, motivated by applications like drug discovery, and proposes LinFACT, which achieves instance optimality by matching a novel lower bound on sample complexity up to a logarithmic factor.
Motivated by the need to efficiently identify multiple candidates in tasks with high trial-and-error costs, such as drug discovery, we propose a near-optimal algorithm to identify all ε-best arms (i.e., those at most ε worse than the optimum). Specifically, we introduce LinFACT, an algorithm designed to optimize the identification of all ε-best arms in linear bandits. We establish a novel information-theoretic lower bound on the sample complexity of this problem and demonstrate that LinFACT achieves instance optimality by matching this lower bound up to a logarithmic factor. A key ingredient of our proof is integrating the lower bound directly into the scaling process for the upper bound derivation, which determines the termination round and thus the sample complexity. We also extend our analysis to settings with model misspecification and generalized linear models. Numerical experiments on synthetic and real drug discovery data demonstrate that LinFACT identifies more promising candidates with reduced sample complexity, offering significant computational efficiency and accelerating early-stage exploratory experiments.
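To make the target set concrete, the following minimal sketch computes the ε-best arms of a linear bandit in the idealized case where the parameter vector is known. It is not LinFACT and is not taken from the paper: the function name `epsilon_best_arms`, the arm features, and the value of `theta` are all hypothetical, and in the actual problem `theta` is unknown and must be estimated from noisy rewards. The sketch only illustrates the definition "at most ε worse than the optimum" under the assumption that an arm's mean reward is the inner product of its feature vector with `theta`.

```python
from typing import List, Sequence


def epsilon_best_arms(
    arms: Sequence[Sequence[float]],
    theta: Sequence[float],
    eps: float,
) -> List[int]:
    """Return indices of arms whose mean reward is within eps of the best mean.

    Assumes a linear model: the mean reward of an arm is <arm, theta>.
    Here theta is treated as known, which is an idealization for illustration.
    """
    # Mean reward of each arm under the assumed linear model.
    means = [sum(a * t for a, t in zip(arm, theta)) for arm in arms]
    best = max(means)
    # An arm is eps-best if it is at most eps worse than the best arm.
    return [i for i, mu in enumerate(means) if mu >= best - eps]


# Hypothetical two-dimensional instance with four arms.
arms = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5), (0.9, 0.1)]
theta = (1.0, 0.25)

good = epsilon_best_arms(arms, theta, eps=0.125)
print(good)
```

In this toy instance the mean rewards are 1.0, 0.25, 0.625, and 0.925, so with ε = 0.125 the first and fourth arms qualify. An identification algorithm has to recover the same set from noisy observations alone, which is where sample complexity enters.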