LGOCOct 14, 2022

Revisiting Optimal Convergence Rate for Smooth and Non-convex Stochastic Decentralized Optimization

arXiv:2210.07863v133 citationsh-index: 27
Originality Highly original
AI Analysis

This work addresses a fundamental gap in understanding performance limits for decentralized optimization, which is crucial for large-scale machine learning applications.

The paper tackles the problem of determining the optimal convergence rate for non-convex stochastic decentralized optimization with general weight matrices, establishing this rate and also for functions satisfying the Polyak-Lojasiewicz condition, and develops a new algorithm to nearly achieve these rates.

Decentralized optimization is effective to save communication in large-scale machine learning. Although numerous algorithms have been proposed with theoretical guarantees and empirical successes, the performance limits in decentralized optimization, especially the influence of network topology and its associated weight matrix on the optimal convergence rate, have not been fully understood. While (Lu and Sa, 2021) have recently provided an optimal rate for non-convex stochastic decentralized optimization with weight matrices defined over linear graphs, the optimal rate with general weight matrices remains unclear. This paper revisits non-convex stochastic decentralized optimization and establishes an optimal convergence rate with general weight matrices. In addition, we also establish the optimal rate when non-convex loss functions further satisfy the Polyak-Lojasiewicz (PL) condition. Following existing lines of analysis in literature cannot achieve these results. Instead, we leverage the Ring-Lattice graph to admit general weight matrices while maintaining the optimal relation between the graph diameter and weight matrix connectivity. Lastly, we develop a new decentralized algorithm to nearly attain the above two optimal rates under additional mild conditions.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes