LGDCDSOCMay 28, 2022

Stochastic Gradient Methods with Compressed Communication for Decentralized Saddle Point Problems

arXiv:2205.14452v22 citationsh-index: 11
Originality Incremental advance
AI Analysis

This work addresses communication efficiency in decentralized optimization for distributed machine learning, but it is incremental as it builds on existing gradient methods with compression.

The paper tackles decentralized saddle-point problems by developing two stochastic gradient algorithms with compressed communication, achieving ε-accurate solutions with specified complexities, such as O((1+δ)^4/(L^2 κ_f^2 κ_g^2 ε)) for general settings and O((1+δ) max{κ_f^2, √δ κ_f^2 κ_g, κ_g} log(1/ε)) for finite sum settings, and shows competitive performance in experiments.

We develop two compression based stochastic gradient algorithms to solve a class of non-smooth strongly convex-strongly concave saddle-point problems in a decentralized setting (without a central server). Our first algorithm is a Restart-based Decentralized Proximal Stochastic Gradient method with Compression (C-RDPSG) for general stochastic settings. We provide rigorous theoretical guarantees of C-RDPSG with gradient computation complexity and communication complexity of order $\mathcal{O}( (1+δ)^4 \frac{1}{L^2}{κ_f^2}κ_g^2 \frac{1}ε )$, to achieve an $ε$-accurate saddle-point solution, where $δ$ denotes the compression factor, $κ_f$ and $κ_g$ denote respectively the condition numbers of objective function and communication graph, and $L$ denotes the smoothness parameter of the smooth part of the objective function. Next, we present a Decentralized Proximal Stochastic Variance Reduced Gradient algorithm with Compression (C-DPSVRG) for finite sum setting which exhibits gradient computation complexity and communication complexity of order $\mathcal{O} \left((1+δ) \max \{κ_f^2, \sqrtδκ^2_fκ_g,κ_g \} \log\left(\frac{1}ε\right) \right)$. Extensive numerical experiments show competitive performance of the proposed algorithms and provide support to the theoretical results obtained.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes