OC DS LG MLFeb 2, 2017

Natasha: Faster Non-Convex Stochastic Optimization Via Strongly Non-Convex Parameter

arXiv:1702.00763v526.182 citations

Originality Incremental advance

AI Analysis

This work addresses optimization challenges in machine learning by improving efficiency for non-convex problems, though it appears incremental as it builds on existing methods with specific parameter adjustments.

The paper tackles the problem of finding approximate stationary points for non-convex stochastic optimization by introducing methods whose convergence depends on the smallest negative eigenvalue of the Hessian, showing that these methods outperform known results for certain parameter ranges and revealing a dichotomy in scaling behaviors.

Given a nonconvex function that is an average of $n$ smooth functions, we design stochastic first-order methods to find its approximate stationary points. The convergence of our new methods depends on the smallest (negative) eigenvalue $-σ$ of the Hessian, a parameter that describes how nonconvex the function is. Our methods outperform known results for a range of parameter $σ$, and can be used to find approximate local minima. Our result implies an interesting dichotomy: there exists a threshold $σ_0$ so that the currently fastest methods for $σ>σ_0$ and for $σ<σ_0$ have different behaviors: the former scales with $n^{2/3}$ and the latter scales with $n^{3/4}$.

View on arXiv PDF

Similar