OC · LG · Oct 14, 2022

Communication-Efficient Topologies for Decentralized Learning with $O(1)$ Consensus Rate

arXiv:2210.07881v2 · 44 citations · h-index: 80 · Has Code
Originality: Highly original
AI Analysis

This work addresses communication bottlenecks in decentralized optimization for distributed machine learning. It offers a new solution that builds on and improves existing topology designs.

The paper tackles slow information mixing in decentralized learning caused by suboptimal communication topologies. It proposes a new family of topologies, EquiTopo, that achieves an $n$-independent consensus rate, where $n$ is the network size. Applied to decentralized SGD and gradient tracking, these topologies yield faster communication and better convergence, supported by theoretical and empirical results.

Decentralized optimization is an emerging paradigm in distributed learning in which agents achieve network-wide solutions by peer-to-peer communication without a central server. Since communication tends to be slower than computation, when each agent communicates with only a few neighboring agents per iteration, they can complete iterations faster than when communicating with more agents or a central server. However, the total number of iterations to reach a network-wide solution is affected by the speed at which the agents' information is "mixed" by communication. We found that popular communication topologies either have large maximum degrees (such as stars and complete graphs) or are ineffective at mixing information (such as rings and grids). To address this problem, we propose a new family of topologies, EquiTopo, which has an (almost) constant degree and a network-size-independent consensus rate, the quantity used to measure mixing efficiency. In the proposed family, EquiStatic has a degree of $\Theta(\ln(n))$, where $n$ is the network size, and a series of time-dependent one-peer topologies, EquiDyn, has a constant degree of 1. We generate EquiDyn through a certain random sampling procedure. Both of them achieve an $n$-independent consensus rate. We apply them to decentralized SGD and decentralized gradient tracking and obtain faster communication and better convergence, theoretically and empirically. Our code is implemented through BlueFog and available at https://github.com/kexinjinnn/EquiTopo.
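The degree-versus-mixing trade-off described in the abstract can be illustrated with a small sketch. It does not implement EquiTopo; it only compares the two baseline extremes the abstract names, a ring (degree 2, slow mixing) and a complete graph (degree $n-1$, instant mixing). The uniform weights of 1/3 on the ring, the one-hot starting vector, the tolerance, and all function names are assumptions made for illustration. The consensus rate is taken here, as is common, to be the second-largest eigenvalue magnitude of the mixing matrix, which for this ring has a closed form.

```python
import math

def ring_mix(x):
    """One gossip round on a ring: each agent averages itself and
    its two neighbours with equal weight 1/3 (an assumed weighting)."""
    n = len(x)
    return [(x[i - 1] + x[i] + x[(i + 1) % n]) / 3.0 for i in range(n)]

def complete_mix(x):
    """One gossip round on a complete graph: every agent receives the exact average."""
    avg = sum(x) / len(x)
    return [avg] * len(x)

def consensus_error(x):
    """Euclidean distance between the agents' values and their average."""
    avg = sum(x) / len(x)
    return math.sqrt(sum((v - avg) ** 2 for v in x))

def rounds_to_consensus(mix, n, tol=1e-3, max_rounds=100000):
    """Count gossip rounds until the consensus error shrinks by a factor tol."""
    x = [1.0] + [0.0] * (n - 1)  # one agent initially holds all the information
    e0 = consensus_error(x)
    for t in range(1, max_rounds + 1):
        x = mix(x)
        if consensus_error(x) <= tol * e0:
            return t
    return max_rounds

def ring_rate(n):
    """Closed-form consensus rate of the ring mixing matrix above:
    eigenvalues of this circulant matrix are 1/3 + (2/3) cos(2*pi*k/n)."""
    return max(abs(1 / 3 + 2 / 3 * math.cos(2 * math.pi * k / n)) for k in range(1, n))

if __name__ == "__main__":
    for n in (8, 32, 128):
        print(n, ring_rate(n), rounds_to_consensus(ring_mix, n),
              rounds_to_consensus(complete_mix, n))
```

In this sketch the ring's rate approaches 1 as $n$ grows, so the number of rounds needed rises with network size, while the complete graph mixes in one round at the cost of a degree that grows with $n$. The abstract's claim is that EquiTopo avoids both costs.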

Code Implementations: 1 repo
