Quantized Winograd/Toom-Cook Convolution for DNNs: Beyond Canonical Polynomials Base
This work addresses accuracy issues in low-precision Winograd convolution for DNNs, offering an incremental improvement for efficient inference in resource-constrained environments.
The paper tackles the numerical accuracy problem of Winograd convolution in quantized deep neural networks by applying a base change technique, achieving nearly the same accuracy (up to 0.5% loss) as direct convolution for an 8-bit quantized ResNet18 on CIFAR10 with few extra operations.
The problem of how to speed up convolution computations in deep neural networks has been widely investigated in recent years. The Winograd convolution algorithm is a commonly used method that significantly reduces computation time. However, it suffers from numerical accuracy problems, particularly at lower precisions. In this paper we present the application of the base change technique to a quantized Winograd-aware training model. We show that we can train the $8$-bit quantized network to nearly the same accuracy (up to 0.5% loss) for the tested network (ResNet18) and dataset (CIFAR10) as quantized direct convolution, with few additional operations in the pre/post transformations. Keeping the Hadamard product on $9$ bits allows us to obtain the same accuracy as for direct convolution.
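To make the setting concrete, the following is a minimal sketch of one-dimensional Winograd convolution F(2,3) with the standard transformation matrices in the canonical polynomial base. It does not implement the base change technique described above. The function names and the symmetric uniform quantizer are assumptions made for illustration only; the quantizer is applied to the two transformed operands just before the Hadamard product, which is one plausible place for precision to be lost.

```python
import numpy as np

# Standard F(2,3) Winograd matrices (canonical base): 2 outputs, 3-tap filter.
BT = np.array([[1, 0, -1, 0],
               [0, 1, 1, 0],
               [0, -1, 1, 0],
               [0, 1, 0, -1]], dtype=np.float64)
G = np.array([[1.0, 0.0, 0.0],
              [0.5, 0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0, 0.0, 1.0]])
AT = np.array([[1, 1, 1, 0],
               [0, 1, -1, -1]], dtype=np.float64)


def quantize(x, bits):
    """Hypothetical symmetric uniform quantizer (an assumption, not the paper's)."""
    max_abs = np.max(np.abs(x))
    if max_abs == 0:
        return x
    scale = max_abs / (2 ** (bits - 1) - 1)
    return np.round(x / scale) * scale


def direct_f23(d, g):
    """Direct computation of 2 outputs from a 4-element input tile and 3-tap filter."""
    return np.array([d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
                     d[1] * g[0] + d[2] * g[1] + d[3] * g[2]])


def winograd_f23(d, g, bits=None):
    """Winograd F(2,3): y = A^T [(G g) * (B^T d)], optionally quantizing operands."""
    u = G @ g    # transformed filter
    v = BT @ d   # transformed input tile
    if bits is not None:
        u, v = quantize(u, bits), quantize(v, bits)
    m = u * v    # Hadamard (element-wise) product: 4 multiplications instead of 6
    return AT @ m


d = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([1.0, 0.0, -1.0])
y_direct = direct_f23(d, g)
y_float = winograd_f23(d, g)
y_q8 = winograd_f23(d, g, bits=8)
print("direct:", y_direct)
print("winograd (float):", y_float)
print("winograd (8-bit operands) abs error:", np.abs(y_q8 - y_direct))
```

In floating point the two results agree, while quantizing the transformed operands introduces an error that depends on the value range produced by the transformation matrices. This sensitivity is the kind of effect a change of polynomial base is intended to reduce.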