LG MLJul 16, 2020

Comparator-adaptive Convex Bandits

Dirk van der Hoeven, Ashok Cutkosky, Haipeng Luo

arXiv:2007.08448v110.110 citations

Originality Incremental advance

AI Analysis

This work addresses comparator-adaptive methods for bandit convex optimization, an incremental advancement from full-information settings.

The paper tackles bandit convex optimization by developing algorithms that adapt to the comparator's norm, achieving small regret when the comparator norm is small, with extensions to Lipschitz or smooth loss functions using new gradient estimators and surrogate losses.

We study bandit convex optimization methods that adapt to the norm of the comparator, a topic that has only been studied before for its full-information counterpart. Specifically, we develop convex bandit algorithms with regret bounds that are small whenever the norm of the comparator is small. We first use techniques from the full-information setting to develop comparator-adaptive algorithms for linear bandits. Then, we extend the ideas to convex bandits with Lipschitz or smooth loss functions, using a new single-point gradient estimator and carefully designed surrogate losses.

View on arXiv PDF

Similar