OCLGNov 10, 2020

Distributed Stochastic Consensus Optimization with Momentum for Nonconvex Nonsmooth Problems

arXiv:2011.05082v145 citations
AI Analysis

This addresses the challenge of distributed optimization in machine learning for non-convex and non-smooth settings, offering improved communication efficiency, though it is incremental relative to prior work.

The paper tackles distributed optimization for non-convex and non-smooth problems by proposing a stochastic algorithm with Nesterov momentum, achieving an ε-stationary solution with O(1/ε²) computation and O(1/ε) communication complexity, which improves communication efficiency over existing methods.

While many distributed optimization algorithms have been proposed for solving smooth or convex problems over the networks, few of them can handle non-convex and non-smooth problems. Based on a proximal primal-dual approach, this paper presents a new (stochastic) distributed algorithm with Nesterov momentum for accelerated optimization of non-convex and non-smooth problems. Theoretically, we show that the proposed algorithm can achieve an $ε$-stationary solution under a constant step size with $\mathcal{O}(1/ε^2)$ computation complexity and $\mathcal{O}(1/ε)$ communication complexity. When compared to the existing gradient tracking based methods, the proposed algorithm has the same order of computation complexity but lower order of communication complexity. To the best of our knowledge, the presented result is the first stochastic algorithm with the $\mathcal{O}(1/ε)$ communication complexity for non-convex and non-smooth problems. Numerical experiments for a distributed non-convex regression problem and a deep neural network based classification problem are presented to illustrate the effectiveness of the proposed algorithms.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes