Optimal Thresholding Linear Bandit
This work addresses a pure exploration problem in bandit theory, which is incremental as it builds on existing methods for linear bandits.
The paper tackles the ε-Thresholding Bandit Problem in stochastic linear bandits by proving a lower bound on sample complexity and extending an algorithm for Best Arm Identification to achieve asymptotic optimality.
We study a novel pure exploration problem: the $ε$-Thresholding Bandit Problem (TBP) with fixed confidence in stochastic linear bandits. We prove a lower bound for the sample complexity and extend an algorithm designed for Best Arm Identification in the linear case to TBP that is asymptotically optimal.