LG, DC, OC, ML · Dec 30, 2019

Variance Reduced Local SGD with Lower Communication Complexity

arXiv:1912.12844v1 · 160 citations
Originality: Incremental advance
AI Analysis

This work addresses communication bottlenecks in distributed machine learning for scenarios with heterogeneous data, offering an incremental improvement over Local SGD.

The paper addresses the high communication complexity of distributed SGD when workers hold non-identical data distributions. It proposes Variance Reduced Local SGD (VRL-SGD), which lowers the communication complexity from O(T^{3/4} N^{3/4}) to O(T^{1/2} N^{3/2}) while maintaining linear iteration speedup. Experiments on three machine learning tasks show VRL-SGD outperforming Local SGD when the data across workers is diverse.
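To make the two bounds concrete, the following minimal sketch evaluates both expressions for one choice of T and N. The function names and the values of T and N are illustrative assumptions. Big-O notation hides constant factors, so the numbers indicate growth rates only, not actual communication counts.

```python
def local_sgd_comms(T, N):
    """Growth term of the Local SGD bound, O(T^(3/4) N^(3/4)), constants dropped."""
    return T ** 0.75 * N ** 0.75


def vrl_sgd_comms(T, N):
    """Growth term of the VRL-SGD bound, O(T^(1/2) N^(3/2)), constants dropped."""
    return T ** 0.5 * N ** 1.5


# Hypothetical setting chosen for illustration: 10^8 iterations, 16 workers.
T, N = 10**8, 16
baseline = local_sgd_comms(T, N)
reduced = vrl_sgd_comms(T, N)

# The ratio simplifies algebraically to T^(1/4) / N^(3/4).
ratio = baseline / reduced
print(f"Local SGD term: {baseline:.3g}, VRL-SGD term: {reduced:.3g}, ratio: {ratio:.3g}")
```

Because the ratio of the two terms is T^{1/4} / N^{3/4}, the VRL-SGD term is the smaller one only when T > N^3. For a fixed number of workers, the gap widens as training runs longer.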

To accelerate the training of machine learning models, distributed stochastic gradient descent (SGD) and its variants have been widely adopted, which apply multiple workers in parallel to speed up training. Among them, Local SGD has gained much attention due to its lower communication cost. Nevertheless, when the data distribution on workers is non-identical, Local SGD requires $O(T^{\frac{3}{4}} N^{\frac{3}{4}})$ communications to maintain its \emph{linear iteration speedup} property, where $T$ is the total number of iterations and $N$ is the number of workers. In this paper, we propose Variance Reduced Local SGD (VRL-SGD) to further reduce the communication complexity. Benefiting from eliminating the dependency on the gradient variance among workers, we theoretically prove that VRL-SGD achieves a \emph{linear iteration speedup} with a lower communication complexity $O(T^{\frac{1}{2}} N^{\frac{3}{2}})$ even if workers access non-identical datasets. We conduct experiments on three machine learning tasks, and the experimental results demonstrate that VRL-SGD performs impressively better than Local SGD when the data among workers are quite diverse.
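The abstract does not spell out the update rule, so the sketch below is an assumption about how a variance-reduced local step might look, not the authors' exact algorithm. Each worker keeps a correction term `delta` that estimates how far its local gradient sits from the average gradient, and subtracts it during local steps. The toy quadratic objectives, the deterministic gradients, and all function names are hypothetical choices made so that the effect of non-identical data is easy to see.

```python
def make_quadratic_grad(a, c):
    """Gradient of the toy local objective f(x) = 0.5 * a * (x - c)^2."""
    return lambda x: a * (x - c)


def local_sgd(grads, x0, lr, k, rounds):
    """Plain Local SGD: k local steps per worker, then average the models."""
    x_avg = x0
    for _ in range(rounds):
        local_models = []
        for grad in grads:
            x = x_avg
            for _ in range(k):
                x -= lr * grad(x)
            local_models.append(x)
        x_avg = sum(local_models) / len(local_models)
    return x_avg


def vrl_sgd(grads, x0, lr, k, rounds):
    """Assumed variance-reduced variant: local steps use grad(x) - delta."""
    x_avg = x0
    deltas = [0.0] * len(grads)  # per-worker gradient corrections
    for _ in range(rounds):
        local_models = []
        for grad, delta in zip(grads, deltas):
            x = x_avg
            for _ in range(k):
                x -= lr * (grad(x) - delta)
            local_models.append(x)
        x_avg = sum(local_models) / len(local_models)
        # Move each correction by the drift of that worker from the average.
        deltas = [d + (x_avg - x) / (k * lr)
                  for d, x in zip(deltas, local_models)]
    return x_avg


# Two workers with different curvature and different optima (non-identical data).
# The minimizer of the summed objective is (1*0 + 3*4) / (1 + 3) = 3.
grads = [make_quadratic_grad(1.0, 0.0), make_quadratic_grad(3.0, 4.0)]
x_local = local_sgd(grads, x0=0.0, lr=0.1, k=10, rounds=50)
x_vrl = vrl_sgd(grads, x0=0.0, lr=0.1, k=10, rounds=50)
print(f"Local SGD: {x_local:.4f}, variance-reduced: {x_vrl:.4f}, optimum: 3.0")
```

On this toy problem, plain Local SGD with a fixed step size and 10 local steps settles at a biased point, while the corrected variant reaches the true minimizer with the same communication schedule. Real training uses stochastic gradients and high-dimensional models, so this only illustrates the mechanism.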

Code Implementations: 1 repo
