OCLGDec 12, 2017

Convergence Rates for Deterministic and Stochastic Subgradient Methods Without Lipschitz Continuity

arXiv:1712.04104v355 citations
Originality Incremental advance
AI Analysis

This work addresses a theoretical limitation in optimization for researchers and practitioners, providing incremental extensions to classic results.

The paper tackles the problem of extending convergence rate theory for subgradient methods to non-Lipschitz convex functions, achieving a global O(1/√T) rate for deterministic methods and O(1/√T) or O(1/T) rates for stochastic methods under specific conditions.

We extend the classic convergence rate theory for subgradient methods to apply to non-Lipschitz functions. For the deterministic projected subgradient method, we present a global $O(1/\sqrt{T})$ convergence rate for any convex function which is locally Lipschitz around its minimizers. This approach is based on Shor's classic subgradient analysis and implies generalizations of the standard convergence rates for gradient descent on functions with Lipschitz or Hölder continuous gradients. Further, we show a $O(1/\sqrt{T})$ convergence rate for the stochastic projected subgradient method on convex functions with at most quadratic growth, which improves to $O(1/T)$ under either strong convexity or a weaker quadratic lower bound condition.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes