Online Optimization : Competing with Dynamic Comparators
This work addresses online optimization for scenarios requiring adaptation to dynamic comparators, offering incremental improvements in regret bounds for applications like game theory.
The paper tackles the problem of online learning by developing a fully adaptive method that competes with dynamic benchmarks, with regret guarantees scaling with the regularity of cost functions and comparators, and applies this to drifting zero-sum games where players achieve no regret against best sequences in hindsight.
Recent literature on online learning has focused on developing adaptive algorithms that take advantage of a regularity of the sequence of observations, yet retain worst-case performance guarantees. A complementary direction is to develop prediction methods that perform well against complex benchmarks. In this paper, we address these two directions together. We present a fully adaptive method that competes with dynamic benchmarks in which regret guarantee scales with regularity of the sequence of cost functions and comparators. Notably, the regret bound adapts to the smaller complexity measure in the problem environment. Finally, we apply our results to drifting zero-sum, two-player games where both players achieve no regret guarantees against best sequences of actions in hindsight.