LG, Mar 17, 2022

Convert, compress, correct: Three steps toward communication-efficient DNN training

arXiv:2203.09044v1
h-index: 20
Originality: Synthesis-oriented
AI Analysis

This paper addresses communication bottlenecks in distributed DNN training. It offers a domain-specific solution that appears incremental, since it combines existing techniques.

The paper tackles communication inefficiency in distributed deep neural network training by introducing the $\mathsf{CO}_3$ algorithm, which processes the gradients in three steps: quantization through floating-point conversion, lossless compression, and error correction. Its performance is assessed through numerical evaluations on CIFAR-10.

In this paper, we introduce a novel algorithm, $\mathsf{CO}_3$, for communication-efficient distributed Deep Neural Network (DNN) training. $\mathsf{CO}_3$ is a joint training/communication protocol, which encompasses three processing steps for the network gradients: (i) quantization through floating-point conversion, (ii) lossless compression, and (iii) error correction. These three components are crucial in the implementation of distributed DNN training over rate-constrained links. The interplay of these three steps in processing the DNN gradients is carefully balanced to yield a robust and high-performance scheme. The performance of the proposed scheme is investigated through numerical evaluations on CIFAR-10.
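The three steps named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the floating-point format, the compressor, or the correction mechanism, so everything concrete here is an assumption: conversion to `float16`, `zlib` as the lossless compressor, and error correction implemented as an error-feedback residual that is added to the next gradient. The function names `encode_gradient` and `decode_gradient` are hypothetical and do not come from the paper or its repository.

```python
import zlib
import numpy as np


def encode_gradient(grad, residual):
    """Worker side (hypothetical): correct, convert, compress.

    Assumptions: float16 conversion, zlib compression, error feedback.
    """
    # Error correction: add back the quantization error left over from the previous round.
    corrected = grad + residual
    # Convert: quantize by casting to a lower-precision floating-point format.
    quantized = corrected.astype(np.float16)
    # Keep what the conversion lost, to be re-injected in the next round.
    new_residual = corrected - quantized.astype(np.float32)
    # Compress: lossless coding of the quantized bytes.
    payload = zlib.compress(quantized.tobytes())
    return payload, new_residual


def decode_gradient(payload, shape):
    """Server side (hypothetical): decompress and restore a float32 gradient."""
    raw = zlib.decompress(payload)
    return np.frombuffer(raw, dtype=np.float16).reshape(shape).astype(np.float32)


# Toy gradient standing in for one layer of a DNN.
rng = np.random.default_rng(0)
grad = rng.normal(scale=1e-2, size=(256, 128)).astype(np.float32)
residual = np.zeros_like(grad)

payload, residual = encode_gradient(grad, residual)
received = decode_gradient(payload, grad.shape)
print(f"raw bytes: {grad.nbytes}, transmitted bytes: {len(payload)}")
```

In this sketch the residual stays on the worker, so the information lost to quantization in one round is not discarded but carried into the next transmitted gradient. How the paper actually balances the three steps may differ from this arrangement.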

Code Implementations: 1 repo
