LG · ML · Jul 23, 2020

Efficient Residue Number System Based Winograd Convolution

arXiv:2007.12216v1 · 14 citations
Originality: Incremental advance
AI Analysis

This work addresses efficiency bottlenecks in low-precision CNN inference, enabling faster and less computationally intensive deployment in resource-constrained environments. It is incremental in that it builds on existing Winograd and RNS methods.

The paper tackles the challenge of applying the Winograd algorithm to low-precision quantized CNNs by extending it to the Residue Number System (RNS). It reports up to 7.03x arithmetic complexity reduction and performance improvements of up to 2.30x and 4.69x for 3x3 and 5x5 filters, respectively, without accuracy degradation.

Prior research has shown that the Winograd algorithm can reduce the computational complexity of convolutional neural networks (CNNs) with weights and activations represented in floating point. However, it is difficult to apply the scheme to the inference of low-precision quantized (e.g. INT8) networks. Our work extends the Winograd algorithm to the Residue Number System (RNS). The minimal-complexity convolution is computed precisely over large transformation tiles (e.g. 10 x 10 to 16 x 16) of filters and activation patches using the Winograd transformation and low-cost (e.g. 8-bit) arithmetic, without degrading the prediction accuracy of the networks during inference. The arithmetic complexity reduction is up to 7.03x, while the performance improvement is up to 2.30x and 4.69x for 3 x 3 and 5 x 5 filters, respectively.
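
To make the idea of Winograd convolution in RNS concrete, here is a minimal Python sketch. It is not the paper's implementation. It uses the small, well-known 1D F(2,3) transform rather than the large 10 x 10 to 16 x 16 tiles the abstract describes, and the moduli (251, 241, 239), the scaled matrix `G2`, and all function names are assumptions chosen for illustration. Each residue channel runs the whole transform with values reduced modulo a number that fits in 8 bits, and the Chinese Remainder Theorem then recovers the exact integer result.

```python
from math import prod

# Assumption: three pairwise-coprime moduli, each below 256 so residues fit in 8 bits.
MODULI = (251, 241, 239)

# Standard Winograd F(2,3) matrices. G is scaled by 2 to stay integer-valued,
# so the transform yields 2*y and the factor is removed after reconstruction.
BT = ((1, 0, -1, 0), (0, 1, 1, 0), (0, -1, 1, 0), (0, 1, 0, -1))
G2 = ((2, 0, 0), (1, 1, 1), (1, -1, 1), (0, 0, 2))
AT = ((1, 1, 1, 0), (0, 1, -1, -1))


def matvec_mod(mat, vec, m):
    """Matrix-vector product with every output reduced modulo m."""
    return [sum(a * v for a, v in zip(row, vec)) % m for row in mat]


def winograd_f23_residue(d, g, m):
    """One residue channel: transform, multiply elementwise, inverse transform."""
    u = matvec_mod(G2, g, m)            # filter transform (scaled by 2)
    v = matvec_mod(BT, d, m)            # input transform
    h = [(a * b) % m for a, b in zip(u, v)]
    return matvec_mod(AT, h, m)         # output transform


def crt_signed(residues, moduli):
    """Chinese Remainder Theorem, mapped to the signed range around zero."""
    big_m = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        mi = big_m // m
        x += r * mi * pow(mi, -1, m)    # pow(..., -1, m) is the modular inverse
    x %= big_m
    return x - big_m if x > big_m // 2 else x


def winograd_f23_rns(d, g):
    """Two outputs of a 3-tap correlation over a 4-element input tile."""
    channels = [winograd_f23_residue(d, g, m) for m in MODULI]
    outputs = []
    for k in range(2):
        doubled = crt_signed([c[k] for c in channels], MODULI)
        outputs.append(doubled // 2)    # exact: the reconstructed value is 2*y
    return outputs


def direct(d, g):
    """Reference result computed with ordinary integer arithmetic."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]


d = [12, -7, 100, -128]   # INT8-range activations
g = [3, -5, 127]          # INT8-range filter taps
assert winograd_f23_rns(d, g) == direct(d, g)
```

The reconstruction is exact only while the doubled result stays within half the product of the moduli (about 7.2 million here), which INT8 inputs to a 3-tap filter satisfy comfortably. Python integers do not model 8-bit hardware, so the sketch shows the arithmetic only, not the reported speedups.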

