OC LG MLMay 25, 2016

NESTT: A Nonconvex Primal-Dual Splitting Method for Distributed and Stochastic Optimization

Davood Hajinezhad, Mingyi Hong, Tuo Zhao, Zhaoran Wang

arXiv:1605.07747v210.145 citations

Originality Incremental advance

AI Analysis

This work addresses distributed and stochastic optimization challenges in machine learning, offering significant speedups for nonconvex problems, though it is incremental as it builds on primal-dual methods.

The authors tackled the problem of distributed and stochastic optimization for nonconvex objectives with nonsmooth regularizers by proposing the NESTT algorithm, which achieves an $\mathcal{O}((\sum_{i=1}^N\sqrt{L_i/N})^2/\epsilon)$ gradient complexity for $\epsilon$-stationary solutions, up to $\mathcal{O}(N)$ times better than gradient descent methods, and Q-linear convergence for specific problems.

We study a stochastic and distributed algorithm for nonconvex problems whose objective consists of a sum of $N$ nonconvex $L_i/N$-smooth functions, plus a nonsmooth regularizer. The proposed NonconvEx primal-dual SpliTTing (NESTT) algorithm splits the problem into $N$ subproblems, and utilizes an augmented Lagrangian based primal-dual scheme to solve it in a distributed and stochastic manner. With a special non-uniform sampling, a version of NESTT achieves $ε$-stationary solution using $\mathcal{O}((\sum_{i=1}^N\sqrt{L_i/N})^2/ε)$ gradient evaluations, which can be up to $\mathcal{O}(N)$ times better than the (proximal) gradient descent methods. It also achieves Q-linear convergence rate for nonconvex $\ell_1$ penalized quadratic problems with polyhedral constraints. Further, we reveal a fundamental connection between primal-dual based methods and a few primal only methods such as IAG/SAG/SAGA.

View on arXiv PDF

Similar