LG | Jan 16, 2025

Coded Deep Learning: Framework and Algorithm

arXiv:2501.09849v1 | 3 citations | h-index: 8 | IEEE Trans Inf Theory
Originality: Incremental advance
AI Analysis

This paper addresses efficiency challenges in deep learning for resource-limited settings. It offers a new compression method, although its integration of coding concepts into deep learning is an incremental advance.

The paper tackles the high computational and storage complexity of deep learning by introducing a coded deep learning framework that compresses model weights and activations, reducing training and inference costs while maintaining performance. Empirical results show it outperforms state-of-the-art DNN compression algorithms.

The success of deep learning (DL) is often achieved with large models and high complexity during both training and post-training inference, hindering training in resource-limited settings. To alleviate these issues, this paper introduces a new framework dubbed "coded deep learning" (CDL), which integrates information-theoretic coding concepts into the inner workings of DL to significantly compress model weights and activations, reduce computational complexity at both the training and post-training inference stages, and enable efficient model/data parallelism. Specifically, within CDL, (i) we first propose a novel probabilistic method for quantizing both model weights and activations, together with a soft differentiable variant that offers an analytic formula for gradient calculation during training; (ii) both the forward and backward passes during training are executed over quantized weights and activations, eliminating most floating-point operations and reducing training complexity; (iii) during training, both weights and activations are entropy constrained so that they are compressible in an information-theoretic sense throughout training, thus reducing communication costs in model/data parallelism; and (iv) the trained model in CDL is by default in a quantized format with compressible quantized weights, reducing post-training inference and storage complexity. Additionally, a variant of CDL, namely relaxed CDL (R-CDL), is presented to further improve the trade-off between validation accuracy and compression, at the cost of requiring full precision during training, while keeping the other advantageous features of CDL intact. Extensive empirical results show that CDL and R-CDL outperform state-of-the-art DNN compression algorithms in the literature.
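The abstract gives no formulas, so the following is only a minimal sketch of the general ideas behind items (i) and (iii), not the authors' method. It assumes a softmax over negative squared distances to a fixed grid of levels as the "soft" quantizer, draws a level at random from those probabilities as the "probabilistic" quantizer, and uses the empirical entropy of the chosen indices as a proxy for compressibility. The function names, the sharpness parameter `alpha`, and the level grid are all hypothetical.

```python
import numpy as np

def soft_quantize(x, levels, alpha=10.0):
    """Soft quantizer (assumed form): softmax-weighted average of the levels.

    Returns the soft values and the per-value probabilities over levels.
    Larger alpha pushes each value toward its nearest level.
    """
    d = (x[:, None] - levels[None, :]) ** 2       # squared distances, shape (n, k)
    logits = -alpha * d
    logits -= logits.max(axis=1, keepdims=True)   # subtract row max for stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    return p @ levels, p

def sample_quantize(p, levels, rng):
    """Probabilistic quantizer (assumed form): draw one level per value from p."""
    cdf = np.cumsum(p, axis=1)
    u = rng.random((p.shape[0], 1))
    idx = (u > cdf).sum(axis=1)                   # inverse-CDF sampling
    idx = np.minimum(idx, len(levels) - 1)        # guard against rounding in the CDF
    return levels[idx], idx

def empirical_entropy_bits(idx, k):
    """Empirical entropy (bits per symbol) of the chosen level indices."""
    counts = np.bincount(idx, minlength=k)
    q = counts[counts > 0] / counts.sum()
    return float(-(q * np.log2(q)).sum())

# Illustrative usage on synthetic "weights".
rng = np.random.default_rng(0)
levels = np.linspace(-1.0, 1.0, 5)                # hypothetical 5-level grid
weights = rng.normal(0.0, 0.5, size=1000)
soft_w, probs = soft_quantize(weights, levels, alpha=50.0)
hard_w, indices = sample_quantize(probs, levels, rng)
bits = empirical_entropy_bits(indices, len(levels))
```

In a training loop, a term proportional to an entropy estimate like `bits` could be added to the loss to encourage compressible quantized values; how CDL actually formulates and optimizes that constraint is described in the paper itself, not here.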

