Unified Analysis of Decentralized Gradient Descent: a Contraction Mapping Framework
This work offers a principled framework for analyzing decentralized optimization algorithms. The contribution is incremental, since the resulting bounds are qualitatively similar to known results, but it makes the analysis more accessible to researchers and practitioners in decentralized machine learning and multi-agent systems.
The authors analyze decentralized gradient descent (DGD) and diffusion for strongly convex, smooth objectives using a contraction mapping framework combined with the mean Hessian theorem (MHT), yielding tight convergence bounds in both the noise-free and noisy regimes. The approach decouples the algorithm dynamics (how quickly the iterates reach the fixed point) from the asymptotic properties (how far the fixed point lies from the global optimum), which gives a simpler and more intuitive analysis.
The decentralized gradient descent (DGD) algorithm, and its sibling, diffusion, are workhorses in decentralized machine learning, distributed inference and estimation, and multi-agent coordination. We propose a novel, principled framework for the analysis of DGD and diffusion for strongly convex, smooth objectives, and arbitrary undirected topologies, using contraction mappings coupled with a result called the mean Hessian theorem (MHT). The use of these tools yields tight convergence bounds, both in the noise-free and noisy regimes. While these bounds are qualitatively similar to results found in the literature, our approach using contractions together with the MHT decouples the algorithm dynamics (how quickly the algorithm converges to its fixed point) from its asymptotic convergence properties (how far the fixed point is from the global optimum). This yields a simple, intuitive analysis that is accessible to a broader audience. Extensions are provided to multiple local gradient updates, time-varying step sizes, noisy gradients (stochastic DGD and diffusion), communication noise, and random topologies.
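The decoupling described above can be illustrated numerically. The following is a minimal sketch, not the paper's construction: it assumes scalar quadratic local objectives f_i(x) = 0.5 * a_i * (x - b_i)^2, a five-node ring with lazy mixing weights, and the standard DGD update x_i <- sum_j W_ij x_j - alpha * f_i'(x_i). The names `ring_weights`, `dgd`, and `dgd_fixed_point`, and all numerical values, are hypothetical choices made for illustration. For this quadratic case the update is an affine map with matrix W - alpha * diag(a), so its spectral norm acts as the contraction factor, and the fixed point can be computed directly with a linear solve.

```python
import numpy as np


def ring_weights(n):
    """Symmetric, doubly stochastic mixing matrix for a ring (assumed lazy weights)."""
    W = 0.5 * np.eye(n)
    for i in range(n):
        W[i, (i - 1) % n] += 0.25
        W[i, (i + 1) % n] += 0.25
    return W


def dgd(W, a, b, alpha, iters):
    """Run DGD on f_i(x) = 0.5 * a_i * (x - b_i)^2; entry i of x is agent i's iterate."""
    x = np.zeros_like(b)
    for _ in range(iters):
        # Combine neighbors' iterates, then take a local gradient step.
        x = W @ x - alpha * a * (x - b)
    return x


def dgd_fixed_point(W, a, b, alpha):
    """Solve x = W x - alpha * a * (x - b) exactly for the DGD fixed point."""
    n = len(b)
    return np.linalg.solve(np.eye(n) - W + alpha * np.diag(a), alpha * a * b)


# Hypothetical problem data: curvatures a_i and local minimizers b_i.
a = np.array([1.0, 1.5, 2.0, 1.2, 1.8])
b = np.array([0.0, 1.0, -2.0, 3.0, 0.5])
W = ring_weights(len(b))

# Minimizer of the global objective sum_i f_i(x).
x_star = np.sum(a * b) / np.sum(a)

alpha = 0.05
x_run = dgd(W, a, b, alpha, iters=2000)
x_fix = dgd_fixed_point(W, a, b, alpha)

# Dynamics: the iteration matrix has spectral norm below 1, so iterates reach x_fix.
contraction = np.linalg.norm(W - alpha * np.diag(a), 2)
dynamics_gap = np.linalg.norm(x_run - x_fix)

# Asymptotics: the fixed point is biased away from x_star; the bias shrinks with alpha.
bias_large = np.linalg.norm(x_fix - x_star)
bias_small = np.linalg.norm(dgd_fixed_point(W, a, b, alpha / 10) - x_star)

print("contraction factor:", contraction)
print("distance from iterate to fixed point:", dynamics_gap)
print("fixed-point bias at alpha and alpha/10:", bias_large, bias_small)
```

In this sketch the two quantities are computed separately: `contraction` and `dynamics_gap` concern only how fast the iterates reach the fixed point, while `bias_large` and `bias_small` concern only how far that fixed point sits from the global optimum. A diffusion variant would instead apply the gradient step before mixing; it is omitted here to keep the example short.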