CV IV Nov 15, 2022

DCT Perceptron Layer: A Transform Domain Approach for Convolution Layer

Hongyi Pan, Xin Zhu, Salih Atici, Ahmet Enis Cetin

arXiv:2211.08577v14.87 citations h-index: 37

Originality: Incremental advance

AI Analysis

This work addresses efficiency in deep learning for computer vision; it is a domain-specific, incremental improvement.

The paper aims to reduce the computational cost of convolutional neural networks by proposing a DCT-perceptron layer that replaces the standard $3\times3$ Conv2D layers in ResNet, achieving comparable accuracy on CIFAR-10 and ImageNet-1K with significantly fewer parameters and multiplications.

In this paper, we propose a novel Discrete Cosine Transform (DCT)-based neural network layer, which we call the DCT-perceptron, to replace the $3\times3$ Conv2D layers in the Residual Neural Network (ResNet). Convolutional filtering operations are performed in the DCT domain using element-wise multiplications by taking advantage of the Fourier and DCT convolution theorems. A trainable soft-thresholding layer is used as the nonlinearity in the DCT-perceptron. Compared to ResNet's Conv2D layer, which is spatial-agnostic and channel-specific, the proposed layer is location-specific and channel-specific. The DCT-perceptron layer significantly reduces the number of parameters and multiplications while maintaining accuracy comparable to regular ResNets on CIFAR-10 and ImageNet-1K. Moreover, the DCT-perceptron layer can be inserted with a batch normalization layer before the global average pooling layer in conventional ResNets as an additional layer to improve classification accuracy.
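The pipeline described in the abstract (forward DCT, element-wise multiplication, soft-thresholding, inverse DCT) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the per-location `scale` and `thresh` arrays, and the tensor layout `(channels, N, N)` are assumptions made for illustration, and anything the abstract does not describe (such as how channels are mixed or how the parameters are trained) is omitted.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II matrix C, so that C @ v is the 1-D DCT of v."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    i = np.arange(n)[None, :]  # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)  # DC row uses a different normalization
    return c


def soft_threshold(x, t):
    """Shrink x toward zero by t; entries with |x| <= t become zero."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def dct_perceptron_sketch(x, scale, thresh):
    """Hypothetical DCT-domain layer on x of shape (channels, N, N).

    scale and thresh (assumed shape: same as x) stand in for trainable
    parameters: one weight and one threshold per channel and per
    DCT coefficient location.
    """
    c = dct_matrix(x.shape[-1])
    coeffs = c @ x @ c.T                 # 2-D DCT of every channel
    coeffs = coeffs * scale              # element-wise filtering in the DCT domain
    coeffs = soft_threshold(coeffs, thresh)  # nonlinearity in the DCT domain
    return c.T @ coeffs @ c              # inverse 2-D DCT back to the spatial domain
```

Because every DCT coefficient has its own weight, the sketch is location-specific as well as channel-specific, in contrast to a Conv2D kernel that is shared across all spatial positions. With `scale` set to ones and `thresh` set to zeros, the function reduces to the identity, since the DCT matrix is orthonormal.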
