LGOCMLNov 28, 2019

D-SPIDER-SFO: A Decentralized Optimization Algorithm with Faster Convergence Rate for Nonconvex Problems

arXiv:1911.12665v119 citations
Originality Incremental advance
AI Analysis

This work addresses decentralized optimization for large-scale machine learning, offering a state-of-the-art method with incremental improvements in convergence rate for nonconvex problems.

The paper tackles the problem of developing a decentralized optimization algorithm for nonconvex problems, proposing D-SPIDER-SFO, which achieves a gradient computation cost of O(ε^{-3}) for finding an ε-approximate first-order stationary point, matching its centralized counterpart.

Decentralized optimization algorithms have attracted intensive interests recently, as it has a balanced communication pattern, especially when solving large-scale machine learning problems. Stochastic Path Integrated Differential Estimator Stochastic First-Order method (SPIDER-SFO) nearly achieves the algorithmic lower bound in certain regimes for nonconvex problems. However, whether we can find a decentralized algorithm which achieves a similar convergence rate to SPIDER-SFO is still unclear. To tackle this problem, we propose a decentralized variant of SPIDER-SFO, called decentralized SPIDER-SFO (D-SPIDER-SFO). We show that D-SPIDER-SFO achieves a similar gradient computation cost---that is, $\mathcal{O}(ε^{-3})$ for finding an $ε$-approximate first-order stationary point---to its centralized counterpart. To the best of our knowledge, D-SPIDER-SFO achieves the state-of-the-art performance for solving nonconvex optimization problems on decentralized networks in terms of the computational cost. Experiments on different network configurations demonstrate the efficiency of the proposed method.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes