Hardware Efficient Approximate Convolution with Tunable Error Tolerance for CNNs

Vishal Shashidhar, Anupam Kumari, Roy P Paily

arXiv:2603.10100v14.8h-index: 25


AI Analysis

The paper addresses energy-efficient inference on resource-constrained edge devices; the advance is incremental, since it builds on existing sparsity techniques.

The paper tackles the high computational demands of CNNs for edge deployment by proposing a 'soft sparsity' method using a Most Significant Bit proxy to skip negligible non-zero multiplications, reducing MACs by up to 88.42% with zero accuracy loss and estimating power savings of up to 35.2%.

Modern CNNs' high computational demands hinder edge deployment, as traditional "hard" sparsity (skipping mathematical zeros) loses effectiveness in deep layers or with smooth activations like Tanh. We propose a "soft sparsity" paradigm using a hardware-efficient Most Significant Bit (MSB) proxy to skip negligible non-zero multiplications. Integrated as a custom RISC-V instruction and evaluated on LeNet-5 (MNIST), this method reduces ReLU MACs by 88.42% and Tanh MACs by 74.87% with zero accuracy loss, outperforming zero-skipping by 5x. By clock-gating inactive multipliers, we estimate power savings of 35.2% for ReLU and 29.96% for Tanh. While memory access makes power reduction sub-linear to operation savings, this approach significantly optimizes resource-constrained inference.
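The abstract does not spell out the exact skip rule, so the following is only a minimal software sketch of what an MSB-proxy test could look like, not the authors' RISC-V instruction. It assumes integer (fixed-point) operands and assumes a product is skipped when the sum of the two operands' MSB positions falls below a tunable `threshold`; the names `msb_index`, `approx_dot`, and `threshold` are hypothetical.

```python
def msb_index(x: int) -> int:
    """Position of the highest set bit of |x|; returns -1 when x is zero."""
    return abs(x).bit_length() - 1


def approx_dot(weights, activations, threshold):
    """Dot product that skips products judged negligible by an MSB proxy.

    Assumed rule (not taken from the paper): skip a multiplication when
    either operand is zero ("hard" sparsity) or when
    msb(w) + msb(a) < threshold ("soft" sparsity).
    Because |w * a| < 2 ** (msb(w) + msb(a) + 2), every skipped product
    has magnitude below 2 ** (threshold + 1).
    """
    acc = 0
    skipped = 0
    for w, a in zip(weights, activations):
        mw, ma = msb_index(w), msb_index(a)
        if mw < 0 or ma < 0 or mw + ma < threshold:
            skipped += 1  # in hardware, this multiplier would be clock-gated
            continue
        acc += w * a
    return acc, skipped


if __name__ == "__main__":
    weights = [64, 3, -2, 0]
    activations = [32, 1, 2, 100]
    print(approx_dot(weights, activations, threshold=0))  # zero-skipping only
    print(approx_dot(weights, activations, threshold=4))  # soft sparsity
```

In this sketch, raising `threshold` trades accuracy for fewer multiplications, which is one plausible reading of the "tunable error tolerance" in the title; a threshold of 0 reduces to ordinary zero-skipping.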

