OCLGMay 23, 2023

Revisiting Subgradient Method: Complexity and Convergence Beyond Lipschitz Continuity

arXiv:2305.14161v28 citations
Originality Incremental advance
AI Analysis

This work addresses a foundational limitation in optimization theory for researchers and practitioners, providing new theoretical guarantees without modifying the algorithm, though it is incremental as it builds on existing subgradient method analysis.

The paper tackles the problem of analyzing the subgradient method for nonsmooth optimization beyond Lipschitz continuity, extending complexity and convergence results to non-Lipschitz convex and weakly convex functions, achieving rates such as O(ε^{-2}) for convex cases and O(ε^{-4}) for weakly convex cases.

The subgradient method is one of the most fundamental algorithmic schemes for nonsmooth optimization. The existing complexity and convergence results for this method are mainly derived for Lipschitz continuous objective functions. In this work, we first extend the typical iteration complexity results for the subgradient method to cover non-Lipschitz convex and weakly convex minimization. Specifically, for the convex case, we can drive the suboptimality gap to below $\varepsilon$ in $\mathcal{O}( \varepsilon^{-2} )$ iterations; for the weakly convex case, we can drive the gradient norm of the Moreau envelope of the objective function to below $\varepsilon$ in $\mathcal{O}( \varepsilon^{-4} )$ iterations. Then, we provide convergence results for the subgradient method in the non-Lipschitz setting when proper diminishing rules on the step size are used. In particular, when $f$ is convex, we establish an $\mathcal{O}(\log(k)/\sqrt{k})$ rate of convergence in terms of the suboptimality gap, where $k$ represents the iteration count. With an additional quadratic growth property, the rate is improved to $\mathcal{O}(1/k)$ in terms of the squared distance to the optimal solution set. When $f$ is weakly convex, asymptotic convergence is established. Our results neither require any modification to the subgradient method nor impose any growth condition on the subgradients, while our analysis is surprisingly simple. To further illustrate the wide applicability of our framework, we extend the aforementioned iteration complexity results to cover the truncated subgradient, the stochastic subgradient, and the proximal subgradient methods for non-Lipschitz convex / weakly convex objective functions.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes