OC LGDec 18, 2017

Third-order Smoothness Helps: Even Faster Stochastic Optimization Algorithms for Finding Local Minima

arXiv:1712.06585v18.03 citations

Originality Incremental advance

AI Analysis

This work addresses faster convergence to local minima for nonconvex optimization problems, which is incremental as it builds on existing methods by leveraging higher-order smoothness.

The paper tackles the problem of finding local minima in nonconvex optimization by exploiting third-order smoothness to escape saddle points more efficiently, resulting in a stochastic gradient complexity of $ ilde{O}(\varepsilon^{-10/3})$, which improves upon the state-of-the-art $ ilde{O}(\varepsilon^{-7/2})$ by a factor of $ ilde{O}(\varepsilon^{-1/6})$.

We propose stochastic optimization algorithms that can find local minima faster than existing algorithms for nonconvex optimization problems, by exploiting the third-order smoothness to escape non-degenerate saddle points more efficiently. More specifically, the proposed algorithm only needs $\tilde{O}(ε^{-10/3})$ stochastic gradient evaluations to converge to an approximate local minimum $\mathbf{x}$, which satisfies $\|\nabla f(\mathbf{x})\|_2\leqε$ and $λ_{\min}(\nabla^2 f(\mathbf{x}))\geq -\sqrtε$ in the general stochastic optimization setting, where $\tilde{O}(\cdot)$ hides logarithm polynomial terms and constants. This improves upon the $\tilde{O}(ε^{-7/2})$ gradient complexity achieved by the state-of-the-art stochastic local minima finding algorithms by a factor of $\tilde{O}(ε^{-1/6})$. For nonconvex finite-sum optimization, our algorithm also outperforms the best known algorithms in a certain regime.

View on arXiv PDF

Similar