CV · IV · Nov 15, 2022

DCT Perceptron Layer: A Transform Domain Approach for Convolution Layer

arXiv:2211.08577v1 · 7 citations · h-index: 37
Originality: Incremental advance
AI Analysis

This paper addresses computational efficiency in deep learning for computer vision. Its contribution is a domain-specific, incremental improvement.

The paper tackles the problem of reducing computational cost in convolutional neural networks by proposing a DCT-perceptron layer to replace standard Conv2D layers in ResNet, achieving comparable accuracy on CIFAR-10 and ImageNet-1K while significantly reducing parameters and multiplications.

In this paper, we propose a novel Discrete Cosine Transform (DCT)-based neural network layer, which we call the DCT-perceptron, to replace the $3\times3$ Conv2D layers in the Residual Neural Network (ResNet). Convolutional filtering operations are performed in the DCT domain using element-wise multiplications by taking advantage of the Fourier and DCT convolution theorems. A trainable soft-thresholding layer is used as the nonlinearity in the DCT-perceptron. Compared to ResNet's Conv2D layer, which is spatial-agnostic and channel-specific, the proposed layer is location-specific and channel-specific. The DCT-perceptron layer significantly reduces the number of parameters and multiplications while maintaining accuracy comparable to regular ResNets on CIFAR-10 and ImageNet-1K. Moreover, the DCT-perceptron layer can be inserted with a batch normalization layer before the global average pooling layer in conventional ResNets as an additional layer to improve classification accuracy.
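The pipeline described in the abstract (transform, element-wise multiplication, soft-thresholding, inverse transform) can be illustrated with a minimal single-channel sketch. It is not the authors' implementation: the function names, the ordering of scaling and thresholding, and the use of one weight and one threshold per DCT coefficient are assumptions made for illustration. Any channel mixing, batching, and training logic a full layer would need are omitted. The sketch uses an orthonormal DCT-II and the standard soft-thresholding function $S_T(v) = \operatorname{sign}(v)\max(|v| - T, 0)$.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix C of size n x n, so C times C^T is the identity."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def dct2(x):
    """2D DCT: transform along rows and columns of a 2D list."""
    c_rows, c_cols = dct_matrix(len(x)), dct_matrix(len(x[0]))
    return matmul(matmul(c_rows, x), transpose(c_cols))


def idct2(coeffs):
    """Inverse 2D DCT (the transpose, because the matrices are orthonormal)."""
    c_rows, c_cols = dct_matrix(len(coeffs)), dct_matrix(len(coeffs[0]))
    return matmul(matmul(transpose(c_rows), coeffs), c_cols)


def soft_threshold(v, t):
    """Shrink v toward zero by t; values with |v| <= t become zero."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def dct_perceptron(x, weights, thresholds):
    """Hypothetical single-channel forward pass (assumed ordering):
    DCT -> per-coefficient scaling -> soft-thresholding -> inverse DCT.
    `weights` and `thresholds` stand in for the trainable parameters
    and have the same shape as x."""
    coeffs = dct2(x)
    shrunk = [[soft_threshold(w * c, t)
               for c, w, t in zip(c_row, w_row, t_row)]
              for c_row, w_row, t_row in zip(coeffs, weights, thresholds)]
    return idct2(shrunk)
```

In this sketch each DCT coefficient gets its own weight, which is one way to read "location-specific": the parameter applied depends on the position in the transform domain, unlike a spatial convolution kernel that is shared across all positions. With all weights set to 1 and all thresholds set to 0, the function reduces to a DCT followed by its inverse and returns the input up to floating-point error.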

