OC LG MLMay 23, 2016

Fast Stochastic Methods for Nonsmooth Nonconvex Optimization

Sashank J. Reddi, Suvrit Sra, Barnabas Poczos, Alex Smola

arXiv:1605.06900v125.154 citations

Originality Highly original

AI Analysis

This addresses a fundamental gap in optimization theory for machine learning, providing theoretical guarantees for stochastic methods in nonsmooth nonconvex settings, which is incremental but important for practitioners dealing with complex models.

The paper tackles the problem of optimizing nonconvex, nonsmooth finite-sum problems by developing fast stochastic algorithms that provably converge to a stationary point for constant minibatches and show faster convergence than batch proximal gradient descent, with global linear convergence for a subclass of functions.

We analyze stochastic algorithms for optimizing nonconvex, nonsmooth finite-sum problems, where the nonconvex part is smooth and the nonsmooth part is convex. Surprisingly, unlike the smooth case, our knowledge of this fundamental problem is very limited. For example, it is not known whether the proximal stochastic gradient method with constant minibatch converges to a stationary point. To tackle this issue, we develop fast stochastic algorithms that provably converge to a stationary point for constant minibatches. Furthermore, using a variant of these algorithms, we show provably faster convergence than batch proximal gradient descent. Finally, we prove global linear convergence rate for an interesting subclass of nonsmooth nonconvex functions, that subsumes several recent works. This paper builds upon our recent series of papers on fast stochastic methods for smooth nonconvex optimization [22, 23], with a novel analysis for nonconvex and nonsmooth functions.

View on arXiv PDF

Similar