LG · DC · Mar 10, 2020

Communication-efficient Variance-reduced Stochastic Gradient Descent

arXiv:2003.04686v1 · 1 citation
AI Analysis

This paper addresses communication bottlenecks in distributed machine learning, particularly over internet-of-things and mobile networks, though it is an incremental improvement on existing variance-reduction methods.

The paper tackles the problem of high communication costs in distributed optimization by proposing a communication-efficient variant of stochastic variance-reduced gradient descent that compresses the communicated information to a few bits. The results show up to a 95% reduction in communication complexity with almost no performance penalty, and greater robustness to quantization than state-of-the-art methods.
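To give a feel for the scale of saving that "a few bits" can imply, the following back-of-the-envelope sketch computes the fraction of bits saved per transmitted coordinate. The bit widths (64-bit floats replaced by 3-bit codes) are assumptions chosen for illustration, and the function name is hypothetical; the paper's 95% figure is its own reported result and is not derived from this calculation.

```python
def communication_reduction(bits_full: int, bits_compressed: int) -> float:
    """Fraction of bits saved per coordinate when a full-precision value
    is replaced by a shorter code (illustrative helper, not from the paper)."""
    if bits_full <= 0 or not (0 < bits_compressed <= bits_full):
        raise ValueError("need 0 < bits_compressed <= bits_full")
    return 1.0 - bits_compressed / bits_full


# Assumed widths: a 64-bit float replaced by a 3-bit code.
saving = communication_reduction(64, 3)
print(f"{saving:.1%}")  # prints 95.3%
```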

We consider the problem of communication-efficient distributed optimization where multiple nodes exchange important algorithm information in every iteration to solve large problems. In particular, we focus on the stochastic variance-reduced gradient and propose a novel approach to make it communication-efficient. That is, we compress the communicated information to a few bits while preserving the linear convergence rate of the original uncompressed algorithm. Comprehensive theoretical and numerical analyses on real datasets reveal that our algorithm can significantly reduce the communication complexity, by as much as 95%, with almost no noticeable penalty. Moreover, it is much more robust to quantization (in terms of maintaining the true minimizer and the convergence rate) than the state-of-the-art algorithms for solving distributed optimization problems. Our results have important implications for using machine learning over internet-of-things and mobile networks.
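The general idea of combining variance reduction with compressed messages can be sketched as follows. This is a minimal single-process simulation and not the paper's algorithm: the stochastic uniform quantizer, the least-squares objective, the synthetic data, and all hyperparameters (`bits`, `step`, `epochs`) are assumptions made for illustration. Every vector that a node would transmit is passed through `quantize` first, and the grid is rescaled to the magnitude of the vector being sent, so the quantization error shrinks as the iterates approach the minimizer.

```python
import numpy as np


def quantize(v, bits, rng):
    """Unbiased stochastic rounding of v onto a uniform grid with 2**bits
    points spanning [-max|v|, max|v|] (an assumed quantizer)."""
    scale = np.max(np.abs(v))
    if scale == 0.0:
        return v.copy()
    levels = 2 ** bits - 1                          # number of grid intervals
    x = (v / scale + 1.0) / 2.0 * levels            # map entries to [0, levels]
    low = np.floor(x)
    # Round up with probability equal to the fractional part (keeps E[q] = v).
    x_q = low + (rng.random(v.shape) < (x - low))
    return (x_q / levels * 2.0 - 1.0) * scale


def quantized_svrg(X, y, bits=4, step=0.1, epochs=40, seed=0):
    """SVRG on f(w) = (1/2n) * ||Xw - y||^2 with every 'communicated'
    gradient vector compressed by quantize()."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        w_snap = w.copy()
        # Full gradient at the snapshot, compressed before being shared.
        full_grad = quantize(X.T @ (X @ w_snap - y) / n, bits, rng)
        for _ in range(n):
            i = rng.integers(n)
            g_w = X[i] * (X[i] @ w - y[i])
            g_snap = X[i] * (X[i] @ w_snap - y[i])
            # Variance-reduced step; the correction term is compressed too.
            w -= step * (quantize(g_w - g_snap, bits, rng) + full_grad)
    return w


# Synthetic, noise-free least-squares problem (hypothetical data).
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 5))
X /= np.linalg.norm(X, axis=1, keepdims=True)       # unit-norm rows
w_true = rng.standard_normal(5)
y = X @ w_true

w_hat = quantized_svrg(X, y)
print("distance to minimizer:", np.linalg.norm(w_hat - w_true))
```

Rescaling the grid to each message is one simple way to keep the quantization noise proportional to the signal; a fixed grid would instead leave a residual error floor around the minimizer.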

