LG ML

Feb 20, 2024

Fast Rates in Stochastic Online Convex Optimization by Exploiting the Curvature of Feasible Sets

arXiv:2402.12868v2

6.42 citations

h-index: 16

NIPS

Originality: Highly original

AI Analysis

This work targets tighter regret bounds in online convex optimization. It offers an incremental advance: algorithms that adapt to the curvature of loss functions are shown to also exploit the curvature of feasible sets.

The paper introduces a new condition and analysis that exploit the curvature of feasible sets to obtain fast regret rates in online convex optimization. It proves an O(ρ log T) regret bound in stochastic environments, an O(√T) bound in adversarial environments, and an O(ρ log T + √(C ρ log T)) bound in corrupted stochastic environments with corruption level C.

In this work, we explore online convex optimization (OCO) and introduce a new condition and analysis that provides fast rates by exploiting the curvature of feasible sets. In online linear optimization, it is known that if the average gradient of loss functions exceeds a certain threshold, the curvature of feasible sets can be exploited by the follow-the-leader (FTL) algorithm to achieve a logarithmic regret. This study reveals that algorithms adaptive to the curvature of loss functions can also leverage the curvature of feasible sets. In particular, we first prove that if an optimal decision is on the boundary of a feasible set and the gradient of an underlying loss function is non-zero, then the algorithm achieves a regret bound of $O(ρ\log T)$ in stochastic environments. Here, $ρ > 0$ is the radius of the smallest sphere that includes the optimal decision and encloses the feasible set. Our approach, unlike existing ones, can work directly with convex loss functions, exploiting the curvature of loss functions simultaneously, and can achieve the logarithmic regret only with a local property of feasible sets. Additionally, the algorithm achieves an $O(\sqrt{T})$ regret even in adversarial environments, in which FTL suffers an $Ω(T)$ regret, and achieves an $O(ρ\log T + \sqrt{C ρ\log T})$ regret in corrupted stochastic environments with corruption level $C$. Furthermore, by extending our analysis, we establish a matching regret upper bound of $O\Big(T^{\frac{q-2}{2(q-1)}} (\log T)^{\frac{q}{2(q-1)}}\Big)$ for $q$-uniformly convex feasible sets, where uniformly convex sets include strongly convex sets and $\ell_p$-balls for $p \in [2,\infty)$. This bound bridges the gap between the $O(\log T)$ bound for strongly convex sets ($q=2$) and the $O(\sqrt{T})$ bound for non-curved sets ($q\to\infty$).
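The known FTL behaviour mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm. It assumes linear losses $f_t(x) = \langle g_t, x \rangle$ on the Euclidean unit ball (a strongly convex set), with gradients drawn around a non-zero mean. The function names `ftl_on_unit_ball` and `sample_gradients` and the noise model are hypothetical choices made for illustration only.

```python
import math
import random


def ftl_on_unit_ball(gradients):
    """Run follow-the-leader for linear losses <g_t, x> on the unit ball.

    Returns the regret against the best fixed point in hindsight.
    Illustrative sketch only; not the algorithm analysed in the paper.
    """
    dim = len(gradients[0])
    cum = [0.0] * dim  # running sum of the gradients seen so far
    alg_loss = 0.0
    for g in gradients:
        norm = math.sqrt(sum(c * c for c in cum))
        # The leader minimises <cum, x> over the ball: x = -cum / ||cum||.
        # Before any gradient is seen every point is a leader; use the origin.
        x = [-c / norm for c in cum] if norm > 0 else [0.0] * dim
        alg_loss += sum(gi * xi for gi, xi in zip(g, x))
        cum = [c + gi for c, gi in zip(cum, g)]
    # Best fixed decision in hindsight attains loss -||sum of gradients||.
    best_loss = -math.sqrt(sum(c * c for c in cum))
    return alg_loss - best_loss


def sample_gradients(T, seed=0):
    """Assumed stochastic environment: mean gradient (1, 0) plus bounded noise."""
    rng = random.Random(seed)
    return [[1.0 + rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5)]
            for _ in range(T)]


if __name__ == "__main__":
    for T in (100, 1000, 10000):
        print(T, round(ftl_on_unit_ball(sample_gradients(T)), 3))
```

Running the script prints the regret for increasing horizons, which can be compared by eye against $\log T$ and $\sqrt{T}$ growth. In this setting the optimal decision lies on the boundary of the ball and the mean gradient is non-zero, which is the regime the abstract describes.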
