LG | OC | ML | Mar 6, 2023

Accelerated Rates between Stochastic and Adversarial Online Convex Optimization

arXiv:2303.03272v2 | 10 citations | h-index: 21
Originality: Incremental advance
AI Analysis

This work addresses a theoretical gap in online learning for data that is neither i.i.d. nor fully adversarial. The advance is incremental: it extends variance-dependent regret bounds, previously known only for linear losses, to convex losses whose expectations are smooth.

The paper studies online convex optimization in settings that interpolate between stochastic i.i.d. and fully adversarial losses. It establishes regret bounds that depend on the variance of the gradients rather than on the maximum gradient length, and that tolerate adversarially poisoned rounds. Matching lower bounds show that the upper bounds are tight across all intermediate regimes.

Stochastic and adversarial data are two widely studied settings in online learning. But many optimization tasks are neither i.i.d. nor fully adversarial, which makes it of fundamental interest to get a better theoretical understanding of the world between these extremes. In this work we establish novel regret bounds for online convex optimization in a setting that interpolates between stochastic i.i.d. and fully adversarial losses. By exploiting smoothness of the expected losses, these bounds replace a dependence on the maximum gradient length by the variance of the gradients, which was previously known only for linear losses. In addition, they weaken the i.i.d. assumption by allowing, for example, adversarially poisoned rounds, which were previously considered in the related expert and bandit settings. In the fully i.i.d. case, our regret bounds match the rates one would expect from results in stochastic acceleration, and we also recover the optimal stochastically accelerated rates via online-to-batch conversion. In the fully adversarial case our bounds gracefully deteriorate to match the minimax regret. We further provide lower bounds showing that our regret upper bounds are tight for all intermediate regimes in terms of the stochastic variance and the adversarial variation of the loss gradients.
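To make the setting concrete, the following is a minimal sketch of the quantities the abstract refers to: regret against the best fixed point in hindsight, an occasional adversarially poisoned round, and online-to-batch conversion by averaging the iterates. It uses plain projected online gradient descent, which is not the paper's algorithm and does not attain the variance-dependent bounds. The quadratic losses, the step size, and the names `run_ogd`, `project_ball`, `noise`, and `poison_every` are all assumptions chosen for illustration.

```python
import numpy as np


def project_ball(x, radius):
    """Euclidean projection onto the ball of the given radius."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)


def run_ogd(T=2000, dim=5, radius=1.0, noise=0.5, poison_every=0, seed=0):
    """Projected online gradient descent on f_t(x) = 0.5 * ||x - c_t||^2.

    Illustrative assumptions: c_t is a noisy copy of a fixed target
    (the stochastic part); if poison_every > 0, every poison_every-th
    round replaces c_t with -target (a crude stand-in for a poisoned round).
    """
    rng = np.random.default_rng(seed)
    target = np.full(dim, 0.3)  # minimizer of the expected loss, inside the ball
    x = np.zeros(dim)
    iterates, centers = [], []
    for t in range(1, T + 1):
        c = target + noise * rng.standard_normal(dim)
        if poison_every and t % poison_every == 0:
            c = -target
        iterates.append(x.copy())  # the learner commits to x before seeing f_t
        centers.append(c)
        grad = x - c  # gradient of f_t at x
        x = project_ball(x - (radius / np.sqrt(t)) * grad, radius)
    iterates = np.array(iterates)
    centers = np.array(centers)

    def loss(points):
        # Per-round losses; a single point is broadcast against all rounds.
        return 0.5 * np.sum((points - centers) ** 2, axis=1)

    # For these quadratics the best fixed comparator in hindsight is the
    # projection of the mean center onto the ball.
    best = project_ball(centers.mean(axis=0), radius)
    regret = loss(iterates).sum() - loss(best).sum()
    averaged = iterates.mean(axis=0)  # online-to-batch: average the iterates
    return regret, averaged, target


if __name__ == "__main__":
    for every in (0, 10):
        regret, averaged, target = run_ogd(poison_every=every)
        print(f"poison_every={every}: regret={regret:.2f}, "
              f"dist(avg, target)={np.linalg.norm(averaged - target):.3f}")
```

Running the script compares a purely stochastic run with one in which every tenth round is poisoned. In the poisoned run the averaged iterate is pulled away from `target`, since the best fixed comparator in hindsight shifts as well.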

