Linear Convergent Decentralized Optimization with Compression
This addresses the need for faster and more stable decentralized optimization in distributed machine learning, particularly for handling heterogeneous data, representing a novel advancement rather than an incremental improvement.
The paper tackles the problem of slow convergence and instability in decentralized optimization with communication compression by proposing LEAD, the first linearly convergent decentralized algorithm with compression, which achieves linear convergence on convex problems and is also applicable to non-convex deep neural networks.
Communication compression has become a key strategy to speed up distributed optimization. However, existing decentralized algorithms with compression mainly focus on compressing DGD-type algorithms. They are unsatisfactory in terms of convergence rate, stability, and the capability to handle heterogeneous data. Motivated by primal-dual algorithms, this paper proposes the first \underline{L}in\underline{EA}r convergent \underline{D}ecentralized algorithm with compression, LEAD. Our theory describes the coupled dynamics of the inexact primal and dual update as well as compression error, and we provide the first consensus error bound in such settings without assuming bounded gradients. Experiments on convex problems validate our theoretical analysis, and empirical study on deep neural nets shows that LEAD is applicable to non-convex problems.