LG OC ML | Feb 15, 2022

Between Stochastic and Adversarial Online Convex Optimization: Improved Regret Bounds via Smoothness

Sarah Sachs, Hédi Hadiji, Tim van Erven, Cristóbal Guzmán

arXiv:2202.07554v2 | 15.129 citations

Originality: Highly original

AI Analysis

This work addresses a fundamental gap in online learning theory: settings that are neither i.i.d. nor fully adversarial. The contribution is incremental in the sense that it extends prior results from linear losses and from the expert and bandit settings to the broader online convex optimization framework.

The paper tackles the problem of online convex optimization in settings that interpolate between stochastic and adversarial data, establishing novel regret bounds that depend on gradient variance rather than maximum gradient length and allowing for adversarially poisoned rounds. The results show tight bounds across all regimes, matching expected rates in i.i.d. cases and gracefully deteriorating to minimax regret in adversarial cases.

Stochastic and adversarial data are two widely studied settings in online learning. But many optimization tasks are neither i.i.d. nor fully adversarial, which makes it of fundamental interest to get a better theoretical understanding of the world between these extremes. In this work we establish novel regret bounds for online convex optimization in a setting that interpolates between stochastic i.i.d. and fully adversarial losses. By exploiting smoothness of the expected losses, these bounds replace a dependence on the maximum gradient length by the variance of the gradients, which was previously known only for linear losses. In addition, they weaken the i.i.d. assumption by allowing, for example, adversarially poisoned rounds, which were previously considered in the expert and bandit setting. Our results extend this to the online convex optimization framework. In the fully i.i.d. case, our bounds match the rates one would expect from results in stochastic acceleration, and in the fully adversarial case they gracefully deteriorate to match the minimax regret. We further provide lower bounds showing that our regret upper bounds are tight for all intermediate regimes in terms of the stochastic variance and the adversarial variation of the loss gradients.
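To make the notion of regret in this setting concrete, the following is a minimal sketch of projected online gradient descent on i.i.d. noisy quadratic losses in one dimension. It is a generic illustration only and not the algorithm analyzed in the paper; the function names (`project`, `run_ogd`), the loss `f_t(x) = 0.5 * (x - z_t)**2`, the step size `radius / sqrt(t)`, and the noise model are all assumptions chosen for the example.

```python
import math
import random


def project(x, radius):
    """Project x onto the interval [-radius, radius]."""
    return max(-radius, min(radius, x))


def run_ogd(targets, radius=1.0):
    """Projected online gradient descent on f_t(x) = 0.5 * (x - z_t)^2.

    `targets` is the sequence z_1, ..., z_T revealed one round at a time.
    Returns the played iterates and the regret against the best fixed
    point in hindsight. Hypothetical helper, for illustration only.
    """
    x = 0.0
    total_loss = 0.0
    iterates = []
    for t, z in enumerate(targets, start=1):
        iterates.append(x)                 # play x_t before seeing z_t
        total_loss += 0.5 * (x - z) ** 2   # suffer the loss f_t(x_t)
        grad = x - z                       # gradient of f_t at x_t
        eta = radius / math.sqrt(t)        # assumed decreasing step size
        x = project(x - eta * grad, radius)

    # For a sum of 1-D quadratics, the best fixed comparator in the
    # interval is the clipped mean of the targets.
    best = project(sum(targets) / len(targets), radius)
    best_loss = sum(0.5 * (best - z) ** 2 for z in targets)
    return iterates, total_loss - best_loss


if __name__ == "__main__":
    random.seed(0)
    T = 1000
    # i.i.d. targets: a fixed mean plus bounded noise (assumed model).
    targets = [0.3 + random.uniform(-0.2, 0.2) for _ in range(T)]
    _, regret = run_ogd(targets)
    print(f"regret after {T} rounds: {regret:.4f}")
```

Replacing some of the i.i.d. targets with arbitrarily chosen values would mimic the adversarially poisoned rounds mentioned in the abstract, again only as an informal experiment rather than a reproduction of the paper's results.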
