LGOCOct 15, 2020

Revisiting Projection-free Online Learning: the Strongly Convex Case

arXiv:2010.07572v231 citations
AI Analysis

This work addresses computational efficiency in high-dimensional constrained optimization for machine learning applications, though it is incremental as it fills a gap in existing theory.

The paper tackles the problem of projection-free online learning for strongly convex functions, showing that the Online Frank-Wolfe method achieves a faster rate of O(T^{2/3}) compared to O(T^{3/4}) for convex functions, and proves a similar bound for the bandit setting.

Projection-free optimization algorithms, which are mostly based on the classical Frank-Wolfe method, have gained significant interest in the machine learning community in recent years due to their ability to handle convex constraints that are popular in many applications, but for which computing projections is often computationally impractical in high-dimensional settings, and hence prohibit the use of most standard projection-based methods. In particular, a significant research effort was put on projection-free methods for online learning. In this paper we revisit the Online Frank-Wolfe (OFW) method suggested by Hazan and Kale \cite{Hazan12} and fill a gap that has been left unnoticed for several years: OFW achieves a faster rate of $O(T^{2/3})$ on strongly convex functions (as opposed to the standard $O(T^{3/4})$ for convex but not strongly convex functions), where $T$ is the sequence length. This is somewhat surprising since it is known that for offline optimization, in general, strong convexity does not lead to faster rates for Frank-Wolfe. We also revisit the bandit setting under strong convexity and prove a similar bound of $\tilde O(T^{2/3})$ (instead of $O(T^{3/4})$ without strong convexity). Hence, in the current state-of-affairs, the best projection-free upper-bounds for the full-information and bandit settings with strongly convex and nonsmooth functions match up to logarithmic factors in $T$.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes