LGOCNov 6, 2020

Communication-efficient Decentralized Local SGD over Undirected Networks

arXiv:2011.03255v114 citations
AI Analysis

This work addresses communication efficiency in decentralized machine learning for networks of agents, though it is incremental as it extends prior analysis to more general topologies.

The paper tackles the problem of distributed learning over arbitrary networks by generalizing Decentralized Local SGD from star topologies to any undirected network, achieving an error scaling of O(1/(nT)) with only Ω(n) communication rounds independent of the number of iterations.

We consider the distributed learning problem where a network of $n$ agents seeks to minimize a global function $F$. Agents have access to $F$ through noisy gradients, and they can locally communicate with their neighbors a network. We study the Decentralized Local SDG method, where agents perform a number of local gradient steps and occasionally exchange information with their neighbors. Previous algorithmic analysis efforts have focused on the specific network topology (star topology) where a leader node aggregates all agents' information. We generalize that setting to an arbitrary network by analyzing the trade-off between the number of communication rounds and the computational effort of each agent. We bound the expected optimality gap in terms of the number of iterates $T$, the number of workers $n$, and the spectral gap of the underlying network. Our main results show that by using only $R=Ω(n)$ communication rounds, one can achieve an error that scales as $O({1}/{nT})$, where the number of communication rounds is independent of $T$ and only depends on the number of agents. Finally, we provide numerical evidence of our theoretical results through experiments on real and synthetic data.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes