Polynomial Cost of Adaptation for X -Armed Bandits
This work addresses the challenge of optimizing bandit algorithms when function smoothness is unknown, offering a solution with proven optimality, though it is incremental in refining adaptation costs.
The paper tackles the problem of adapting to unknown smoothness in stochastic continuum-armed bandits by presenting an algorithm that achieves a polynomial cost of adaptation for regret minimization, matching minimal rate functions derived from a lower bound.
In the context of stochastic continuum-armed bandits, we present an algorithm that adapts to the unknown smoothness of the objective function. We exhibit and compute a polynomial cost of adaptation to the H{ö}lder regularity for regret minimization. To do this, we first reconsider the recent lower bound of Locatelli and Carpentier [20], and define and characterize admissible rate functions. Our new algorithm matches any of these minimal rate functions. We provide a finite-time analysis and a thorough discussion about asymptotic optimality.