LoDAdaC: a unified local training-based decentralized framework with adaptive gradients and compressed communication
This work addresses the need for efficient and fast-converging decentralized learning algorithms, particularly for federated learning and large-scale distributed systems.
LoDAdaC is a unified decentralized learning framework that combines multiple local training steps, adaptive gradient methods (e.g., Adam), and compressed communication to achieve fast convergence at low communication cost. Experiments on image classification and GPT-style language model training show that it significantly outperforms existing decentralized algorithms in convergence speed and communication efficiency.
In decentralized distributed learning, fast convergence and low communication cost are essential for scalability and efficiency. Adaptive gradient methods, such as Adam, have demonstrated strong practical performance in deep learning and in centralized distributed settings. However, their convergence properties remain largely unexplored in decentralized settings that involve multiple local training steps, such as federated learning. To address this limitation, we propose LoDAdaC, a unified multiple Local Training (MLT) Decentralized framework with Adam-type updates and Compressed communication (CC). LoDAdaC accommodates a broad class of optimizers for its local adaptive updates, including AMSGrad, Adam, and AdaGrad, and it is compatible with standard (possibly biased) compressors such as low-bit quantization and sparsification. MLT and CC together give LoDAdaC a multiplicative reduction in communication cost: MLT lowers the number of communication rounds and CC lowers the amount of data sent per round. The adaptive updates, in turn, enable fast convergence. We rigorously prove this combined advantage through complexity analysis. In addition, experiments on image classification and GPT-style language model training validate our theoretical findings and show that LoDAdaC significantly outperforms existing decentralized algorithms in terms of convergence speed and communication efficiency.
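To make the three ingredients concrete, the following is a minimal NumPy sketch of how local Adam-type steps, a biased top-k sparsifier, and gossip averaging over a ring might fit together. It is not the LoDAdaC algorithm itself, because the abstract does not give the update rule. The communication step uses a CHOCO-style error-compensated gossip as a stand-in, and the quadratic objectives, the ring topology, the function names (`top_k`, `ring_mixing_matrix`, `simulate`, `comm_reduction`), and all hyperparameters are assumptions chosen for illustration.

```python
import numpy as np


def top_k(v, k):
    """Biased top-k sparsifier: keep the k largest-magnitude entries, zero the rest."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out


def ring_mixing_matrix(n):
    """Symmetric, doubly stochastic mixing weights for a ring of n nodes."""
    W = np.zeros((n, n))
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i, j % n] += 1.0 / 3.0
    return W


def comm_reduction(d, k, local_steps):
    """Values sent by dense per-step exchange divided by values sent when
    top-k messages go out once per round (index overhead is ignored)."""
    return (local_steps * d) / k


def simulate(n=4, d=10, k=5, local_steps=5, rounds=60,
             lr=0.05, gamma=0.1, b1=0.9, b2=0.999, eps=1e-8, seed=0):
    """Toy run: node i minimizes 0.5 * ||x - c[i]||^2 (an assumed objective).

    Returns (final, initial) distance of the average model to the global minimizer.
    """
    rng = np.random.default_rng(seed)
    target = np.linspace(-1.0, 1.0, d)
    c = target + 0.1 * rng.standard_normal((n, d))  # mildly heterogeneous data
    W = ring_mixing_matrix(n)

    x = np.zeros((n, d))      # local models, one row per node
    x_hat = np.zeros((n, d))  # public copies, updated only by compressed messages
    m = np.zeros((n, d))      # Adam first-moment estimates
    v = np.zeros((n, d))      # Adam second-moment estimates
    t = 0
    for _round in range(rounds):
        # (1) Multiple local training steps with Adam updates, no communication.
        for _step in range(local_steps):
            t += 1
            g = x - c  # gradient of the local quadratic
            m = b1 * m + (1 - b1) * g
            v = b2 * v + (1 - b2) * g * g
            m_hat = m / (1 - b1 ** t)
            v_hat = v / (1 - b2 ** t)
            x = x - lr * m_hat / (np.sqrt(v_hat) + eps)
        # (2) Compressed communication: each node sends top-k of (x - x_hat).
        q = np.stack([top_k(x[i] - x_hat[i], k) for i in range(n)])
        x_hat = x_hat + q
        # (3) Gossip averaging on the public copies (preserves the node average).
        x = x + gamma * (W @ x_hat - x_hat)

    c_bar = c.mean(axis=0)  # minimizer of the average objective
    return np.linalg.norm(x.mean(axis=0) - c_bar), np.linalg.norm(c_bar)


if __name__ == "__main__":
    final_dist, init_dist = simulate()
    print("distance to optimum:", init_dist, "->", final_dist)
    print("communication reduction factor:", comm_reduction(d=10, k=5, local_steps=5))
```

In this toy setting the two savings multiply: communicating once every `local_steps` steps and sending `k` of `d` coordinates reduces the number of transmitted values by a factor of `local_steps * d / k` relative to dense exchange at every step. The small mixing step `gamma` is a conservative choice to keep the compressed gossip stable and is not taken from the paper.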