OCLGDec 18, 2017

Third-order Smoothness Helps: Even Faster Stochastic Optimization Algorithms for Finding Local Minima

arXiv:1712.06585v13 citations
Originality Incremental advance
AI Analysis

This work addresses faster convergence to local minima for nonconvex optimization problems, which is incremental as it builds on existing methods by leveraging higher-order smoothness.

The paper tackles the problem of finding local minima in nonconvex optimization by exploiting third-order smoothness to escape saddle points more efficiently, resulting in a stochastic gradient complexity of $ ilde{O}(\varepsilon^{-10/3})$, which improves upon the state-of-the-art $ ilde{O}(\varepsilon^{-7/2})$ by a factor of $ ilde{O}(\varepsilon^{-1/6})$.

We propose stochastic optimization algorithms that can find local minima faster than existing algorithms for nonconvex optimization problems, by exploiting the third-order smoothness to escape non-degenerate saddle points more efficiently. More specifically, the proposed algorithm only needs $\tilde{O}(ε^{-10/3})$ stochastic gradient evaluations to converge to an approximate local minimum $\mathbf{x}$, which satisfies $\|\nabla f(\mathbf{x})\|_2\leqε$ and $λ_{\min}(\nabla^2 f(\mathbf{x}))\geq -\sqrtε$ in the general stochastic optimization setting, where $\tilde{O}(\cdot)$ hides logarithm polynomial terms and constants. This improves upon the $\tilde{O}(ε^{-7/2})$ gradient complexity achieved by the state-of-the-art stochastic local minima finding algorithms by a factor of $\tilde{O}(ε^{-1/6})$. For nonconvex finite-sum optimization, our algorithm also outperforms the best known algorithms in a certain regime.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes