CV · AI · Feb 26, 2021

Accelerating Large Kernel Convolutions with Nested Winograd Transformation

arXiv:2102.13272v2 · 6 citations
AI Analysis

This work addresses the multiplication cost that limits acceleration of large kernel CNNs in computer vision tasks and offers a more efficient method for AI processors. The contribution is incremental, since it builds on prior Winograd-based approaches.

The paper tackles the computational inefficiency of large kernel convolutions in CNNs by proposing a nested Winograd algorithm. Compared with existing linear decomposition methods, it reduces the total number of multiplications by a factor of 1.4 to 10.5 for kernel sizes from 4x4 to 31x31.

Recent literature has shown that convolutional neural networks (CNNs) with large kernels outperform vision transformers (ViTs) and CNNs with stacked small kernels in many computer vision tasks, such as object detection and image restoration. The Winograd transformation helps reduce the number of repetitive multiplications in convolution and is widely supported by many commercial AI processors. Researchers have proposed accelerating large kernel convolutions by linearly decomposing them into many small kernel convolutions and then sequentially accelerating each small kernel convolution with the Winograd algorithm. This work proposes a nested Winograd algorithm that iteratively decomposes a large kernel convolution into small kernel convolutions and proves it to be more effective than the linear decomposition Winograd transformation algorithm. Experiments show that compared to the linear decomposition Winograd algorithm, the proposed algorithm reduces the total number of multiplications by 1.4 to 10.5 times for computing 4x4 to 31x31 convolutions.
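To make the baseline concrete, here is a minimal 1D sketch of the linear decomposition idea described above. It is not the paper's nested algorithm. It uses the standard Winograd F(2,3) minimal filtering identity, which computes two outputs of a 3-tap filter with 4 multiplications instead of 6, and then splits a 6-tap kernel into two 3-tap pieces that are each handled by F(2,3). The function names (`winograd_f23`, `direct_corr`, `linear_decomposed_6tap`) and the 6-tap example size are assumptions chosen for illustration only.

```python
def winograd_f23(d, g):
    """Winograd F(2,3): 2 outputs of a 3-tap filter over 4 inputs, 4 multiplications."""
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    # Filter-side transform; it can be precomputed once per filter,
    # so its divisions are not counted as per-output multiplications.
    u0 = g0
    u1 = (g0 + g1 + g2) / 2
    u2 = (g0 - g1 + g2) / 2
    u3 = g2
    # The four multiplications between transformed input and transformed filter.
    m1 = (d0 - d2) * u0
    m2 = (d1 + d2) * u1
    m3 = (d2 - d1) * u2
    m4 = (d1 - d3) * u3
    # Output transform uses additions only.
    return [m1 + m2 + m3, m2 - m3 - m4]


def direct_corr(d, g, n_out):
    """Reference sliding-window computation (CNN-style convolution, no kernel flip)."""
    return [sum(d[i + k] * g[k] for k in range(len(g))) for i in range(n_out)]


def linear_decomposed_6tap(d, g):
    """Illustrative linear decomposition: a 6-tap kernel as two 3-tap kernels.

    Each half is computed with F(2,3) and the partial results are summed,
    giving 8 multiplications for 2 outputs instead of 12.
    """
    low = winograd_f23(d[0:4], g[0:3])   # taps 0..2 act on inputs 0..3
    high = winograd_f23(d[3:7], g[3:6])  # taps 3..5 act on inputs 3..6
    return [low[0] + high[0], low[1] + high[1]]


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
    kernel = [1.0, 0.0, -1.0, 2.0, 3.0, 1.0]
    print(linear_decomposed_6tap(signal, kernel))
    print(direct_corr(signal, kernel, 2))
```

In this sketch the number of small Winograd calls grows linearly with the kernel size. That cost is what the nested decomposition proposed in the paper is reported to reduce.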

