LG · IT · SP · Apr 24, 2025

Coding for Computation: Efficient Compression of Neural Networks for Reconfigurable Hardware

arXiv:2504.17403v1 · 1 citation · h-index: 15 · SSP
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of implementing large neural networks efficiently on hardware such as FPGAs, though it appears incremental because it builds on existing compression techniques.

The paper tackles the problem of resource-efficient neural network inference on reconfigurable hardware by introducing a compression scheme that reduces the number of computations, specifically additions, achieving competitive performance for models like ResNet-34.

As state-of-the-art neural networks (NNs) continue to grow in size, their resource-efficient implementation becomes ever more important. In this paper, we introduce a compression scheme that reduces the number of computations required for NN inference on reconfigurable hardware such as FPGAs. This is achieved by combining pruning via regularized training, weight sharing, and linear computation coding (LCC). Contrary to common NN compression techniques, where the objective is to reduce the memory used for storing the weights of the NNs, our approach is optimized to reduce the number of additions required for inference in a hardware-friendly manner. The proposed scheme achieves competitive performance for simple multilayer perceptrons, as well as for large-scale deep NNs such as ResNet-34.
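
To make the "number of additions" objective concrete, here is a minimal Python sketch of how pruning and weight sharing can lower the addition count of a single dot product. It is an illustration only, not the paper's scheme, and it does not implement LCC. It assumes integer (fixed-point) weights and that each constant multiplication is realized as plain binary shift-and-add, so a weight with b set bits costs b - 1 additions. The names `shift_add_cost`, `dense_row`, and `shared_row` are hypothetical.

```python
from collections import defaultdict

def shift_add_cost(w):
    """Additions needed to multiply by the integer constant w using plain
    binary shift-and-add (assumption: one shifted copy per set bit)."""
    return max(bin(abs(w)).count("1") - 1, 0)

def dense_row(row, x):
    """Return (dot product, addition count) when every nonzero weight
    is multiplied separately. Pruned (zero) weights cost nothing."""
    nz = [(w, xi) for w, xi in zip(row, x) if w != 0]
    if not nz:
        return 0, 0
    adds = sum(shift_add_cost(w) for w, _ in nz)  # constant multiplications
    adds += len(nz) - 1                           # accumulate the products
    return sum(w * xi for w, xi in nz), adds

def shared_row(row, x):
    """Same dot product, but inputs that share a weight value are summed
    first, so each distinct value is multiplied only once."""
    groups = defaultdict(list)
    for w, xi in zip(row, x):
        if w != 0:
            groups[w].append(xi)
    if not groups:
        return 0, 0
    adds = sum(len(g) - 1 for g in groups.values())  # sums inside each group
    adds += sum(shift_add_cost(w) for w in groups)   # one multiply per value
    adds += len(groups) - 1                          # combine group results
    return sum(w * sum(g) for w, g in groups.items()), adds

row = [3, 0, 3, 5, 0, 5, 3, 0]   # pruned row with two shared weight values
x = [1, 2, 3, 4, 5, 6, 7, 8]
print(dense_row(row, x))    # (83, 9)
print(shared_row(row, x))   # (83, 6)
```

In this toy model, pruning removes terms outright, while weight sharing leaves the accumulation cost unchanged and saves only the repeated constant multiplications. The benefit therefore grows with the number of set bits per weight and with how often values repeat within a row.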
