Accelerated Distributed Optimization with Compression and Error Feedback
This work addresses communication efficiency in distributed machine learning, offering a novel method that is incremental but provides theoretical and practical improvements for large-scale optimization tasks.
The paper tackles the problem of communication bottlenecks in distributed optimization by proposing ADEF, a novel algorithm that integrates Nesterov acceleration with compression techniques, achieving the first accelerated convergence rate for stochastic distributed optimization with contractive compression in the general convex regime, as validated by numerical experiments showing reduced communication costs and fast convergence.
Modern machine learning tasks often involve massive datasets and models, necessitating distributed optimization algorithms with reduced communication overhead. Communication compression, where clients transmit compressed updates to a central server, has emerged as a key technique to mitigate communication bottlenecks. However, the theoretical understanding of stochastic distributed optimization with contractive compression remains limited, particularly in conjunction with Nesterov acceleration -- a cornerstone for achieving faster convergence in optimization. In this paper, we propose a novel algorithm, ADEF (Accelerated Distributed Error Feedback), which integrates Nesterov acceleration, contractive compression, error feedback, and gradient difference compression. We prove that ADEF achieves the first accelerated convergence rate for stochastic distributed optimization with contractive compression in the general convex regime. Numerical experiments validate our theoretical findings and demonstrate the practical efficacy of ADEF in reducing communication costs while maintaining fast convergence.