CV · IV · SP · Mar 13, 2023

Multichannel Orthogonal Transform-Based Perceptron Layers for Efficient ResNets

arXiv:2303.06797v3 · 11 citations · h-index: 37
Originality: Incremental advance
AI Analysis

This work addresses computational efficiency and accuracy improvements in CNNs for image classification, though it appears incremental as it modifies existing ResNet architectures rather than introducing a new paradigm.

The authors tackled the inefficiency of standard convolutional layers in ResNets by proposing transform-based layers using orthogonal transforms like DCT, HT, and BWT, which reduce parameters and multiplications while improving accuracy on ImageNet-1K classification.

In this paper, we propose a set of transform-based neural network layers as an alternative to the $3\times3$ Conv2D layers in Convolutional Neural Networks (CNNs). The proposed layers can be implemented based on orthogonal transforms such as the Discrete Cosine Transform (DCT), Hadamard transform (HT), and biorthogonal Block Wavelet Transform (BWT). Furthermore, by taking advantage of the convolution theorems, convolutional filtering operations are performed in the transform domain using element-wise multiplications. Trainable soft-thresholding layers, which remove noise in the transform domain, bring nonlinearity to the transform domain layers. Compared to the Conv2D layer, which is spatial-agnostic and channel-specific, the proposed layers are location-specific and channel-specific. Moreover, the proposed layers significantly reduce the number of parameters and multiplications while improving the accuracy of regular ResNets on the ImageNet-1K classification task. They can also be inserted, together with a batch normalization layer, before the global average pooling layer in conventional ResNets as an additional layer to improve classification accuracy.
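The pipeline the abstract describes (forward transform, element-wise multiplication, trainable soft-thresholding, inverse transform) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `dct_matrix`, `soft_threshold`, and `dct_perceptron_layer` are hypothetical, the DCT is used as the example transform, and the `mix` matrix (a 1x1 channel-mixing step) is an assumption about how a multichannel layer might combine channels. In a real network `scale`, `mix`, and `thresh` would be trainable parameters.

```python
import numpy as np


def dct_matrix(n):
    """Return the orthonormal DCT-II matrix D of shape (n, n), so D @ D.T == I."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    i = np.arange(n)[None, :]  # sample index (columns)
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    d[0, :] = np.sqrt(1.0 / n)  # DC row uses a different normalization
    return d


def soft_threshold(x, t):
    """Soft-thresholding: shrink magnitudes by t and zero out anything below t."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def dct_perceptron_layer(x, scale, mix, thresh):
    """Illustrative transform-domain layer (hypothetical, simplified).

    x      : input feature map, shape (C, N, N)
    scale  : per-channel, per-location weights, shape (C, N, N)
    mix    : channel-mixing matrix, shape (C, C)  (assumed 1x1 mixing step)
    thresh : non-negative thresholds, shape (C, N, N)
    """
    d = dct_matrix(x.shape[-1])
    z = d @ x @ d.T                        # 2D DCT applied to every channel
    z = z * scale                          # element-wise product replaces spatial filtering
    z = np.einsum("oc,chw->ohw", mix, z)   # mix information across channels
    z = soft_threshold(z, thresh)          # nonlinearity in the transform domain
    return d.T @ z @ d                     # inverse 2D DCT (D is orthonormal)


# Usage: with identity parameters the layer reduces to DCT followed by inverse DCT.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 8, 8))
y = dct_perceptron_layer(x, np.ones_like(x), np.eye(3), np.zeros_like(x))
```

Because `scale` and `thresh` hold a separate value for every channel and every transform-domain position, this sketch is location-specific and channel-specific in the sense the abstract uses, whereas a Conv2D kernel is shared across all spatial positions.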

Code Implementations: 1 repo
