OC LG NE MLMar 19, 2016

Stochastic Variance Reduction for Nonconvex Optimization

Sashank J. Reddi, Ahmed Hefny, Suvrit Sra, Barnabas Poczos, Alex Smola

arXiv:1603.06160v250.2653 citations

Originality Incremental advance

AI Analysis

This work addresses optimization challenges in machine learning for nonconvex settings, providing theoretical guarantees for SVRG, which is incremental as it extends existing convex methods to nonconvex problems.

The paper tackles nonconvex finite-sum optimization problems by analyzing stochastic variance reduced gradient (SVRG) methods, proving non-asymptotic convergence rates to stationary points and showing SVRG is provably faster than SGD and gradient descent, with linear convergence to the global optimum for a subclass of problems.

We study nonconvex finite-sum problems and analyze stochastic variance reduced gradient (SVRG) methods for them. SVRG and related methods have recently surged into prominence for convex optimization given their edge over stochastic gradient descent (SGD); but their theoretical analysis almost exclusively assumes convexity. In contrast, we prove non-asymptotic rates of convergence (to stationary points) of SVRG for nonconvex optimization, and show that it is provably faster than SGD and gradient descent. We also analyze a subclass of nonconvex problems on which SVRG attains linear convergence to the global optimum. We extend our analysis to mini-batch variants of SVRG, showing (theoretical) linear speedup due to mini-batching in parallel settings.

View on arXiv PDF

Similar