Decentralized Optimization with Topology-Independent Communication
This work addresses the problem of efficient distributed optimization for large-scale systems, which is crucial for applications in machine learning and data science.
The authors tackled the problem of distributed optimization with a novel method that reduces communication complexity, achieving a convergence rate of $ ilde{mathcal{O}}(varepsilon^{-2})$ iterations for convex objectives and $mathcal{O}(varepsilon^{-1})$ to an $varepsilon$-solution for strongly convex objectives. The method reduces expected communication to exactly 2 messages per iteration.
Distributed optimization requires nodes to coordinate, yet full synchronization scales poorly. When $n$ nodes collaborate through $m$ pairwise regularizers, standard methods demand $\mathcal{O}(m)$ communications per iteration. This paper proposes randomized local coordination: each node independently samples one regularizer uniformly and coordinates only with nodes sharing that term. This exploits partial separability, where each regularizer $G_j$ depends on a subset $S_j \subseteq \{1,\ldots,n\}$ of nodes. For graph-guided regularizers where $|S_j|=2$, expected communication drops to exactly 2 messages per iteration. This method achieves $\tilde{\mathcal{O}}(\varepsilon^{-2})$ iterations for convex objectives and under strong convexity, $\mathcal{O}(\varepsilon^{-1})$ to an $\varepsilon$-solution and $\mathcal{O}(\log(1/\varepsilon))$ to a neighborhood. Replacing the proximal map of the sum $\sum_j G_j$ with the proximal map of a single randomly selected regularizer $G_j$ preserves convergence while eliminating global coordination. Experiments validate both convergence rates and communication efficiency across synthetic and real-world datasets.