OCLGMLMay 25, 2016

NESTT: A Nonconvex Primal-Dual Splitting Method for Distributed and Stochastic Optimization

arXiv:1605.07747v245 citations
Originality Incremental advance
AI Analysis

This work addresses distributed and stochastic optimization challenges in machine learning, offering significant speedups for nonconvex problems, though it is incremental as it builds on primal-dual methods.

The authors tackled the problem of distributed and stochastic optimization for nonconvex objectives with nonsmooth regularizers by proposing the NESTT algorithm, which achieves an $\mathcal{O}((\sum_{i=1}^N\sqrt{L_i/N})^2/\epsilon)$ gradient complexity for $\epsilon$-stationary solutions, up to $\mathcal{O}(N)$ times better than gradient descent methods, and Q-linear convergence for specific problems.

We study a stochastic and distributed algorithm for nonconvex problems whose objective consists of a sum of $N$ nonconvex $L_i/N$-smooth functions, plus a nonsmooth regularizer. The proposed NonconvEx primal-dual SpliTTing (NESTT) algorithm splits the problem into $N$ subproblems, and utilizes an augmented Lagrangian based primal-dual scheme to solve it in a distributed and stochastic manner. With a special non-uniform sampling, a version of NESTT achieves $ε$-stationary solution using $\mathcal{O}((\sum_{i=1}^N\sqrt{L_i/N})^2/ε)$ gradient evaluations, which can be up to $\mathcal{O}(N)$ times better than the (proximal) gradient descent methods. It also achieves Q-linear convergence rate for nonconvex $\ell_1$ penalized quadratic problems with polyhedral constraints. Further, we reveal a fundamental connection between primal-dual based methods and a few primal only methods such as IAG/SAG/SAGA.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes