OC DS LG NE MLAug 29, 2017

Natasha 2: Faster Non-Convex Optimization Than SGD

arXiv:1708.08694v445.3258 citations

Originality Highly original

AI Analysis

This provides a faster optimization method for non-convex functions, which is incremental but offers concrete improvements in computational efficiency for machine learning practitioners.

The paper tackles the problem of training smooth neural networks to find approximate local minima, achieving a faster rate of O(ε^{-3.25}) backpropagations compared to the previous best of O(ε^{-4}) by SGD.

We design a stochastic algorithm to train any smooth neural network to $\varepsilon$-approximate local minima, using $O(\varepsilon^{-3.25})$ backpropagations. The best result was essentially $O(\varepsilon^{-4})$ by SGD. More broadly, it finds $\varepsilon$-approximate local minima of any smooth nonconvex function in rate $O(\varepsilon^{-3.25})$, with only oracle access to stochastic gradients.

View on arXiv PDF

Similar