CV | AI | IV | ML | Dec 1, 2022

From CNNs to Shift-Invariant Twin Models Based on Complex Wavelets

arXiv:2212.00394v3 | 1 citation | h-index: 35
Originality: Incremental advance
AI Analysis

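This work addresses shift invariance in CNNs for computer vision tasks. It reports higher accuracy than prior methods based on low-pass filtering, and a lower computational cost and memory footprint than concurrent work, though the advance is incremental because it builds on existing CNN architectures.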

The paper tackles the problem of improving shift invariance and prediction accuracy in CNNs by replacing the first-layer combination of real-valued convolutions and max pooling with complex-valued convolutions and modulus, constrained to Gabor-like filters. This approach achieves superior accuracy on ImageNet and CIFAR-10 classification tasks compared to prior methods, with lower computational cost and memory footprint.

We propose a novel method to increase shift invariance and prediction accuracy in convolutional neural networks. Specifically, we replace the first-layer combination "real-valued convolutions + max pooling" (RMax) by "complex-valued convolutions + modulus" (CMod), which is stable to translations, or shifts. To justify our approach, we claim that CMod and RMax produce comparable outputs when the convolution kernel is band-pass and oriented (Gabor-like filter). In this context, CMod can therefore be considered as a stable alternative to RMax. To enforce this property, we constrain the convolution kernels to adopt such a Gabor-like structure. The corresponding architecture is called mathematical twin, because it employs a well-defined mathematical operator to mimic the behavior of the original, freely-trained model. Our approach achieves superior accuracy on ImageNet and CIFAR-10 classification tasks, compared to prior methods based on low-pass filtering. Arguably, our approach's emphasis on retaining high-frequency details contributes to a better balance between shift invariance and information preservation, resulting in improved performance. Furthermore, it has a lower computational cost and memory footprint than concurrent work, making it a promising solution for practical implementation.
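
The contrast between RMax and CMod described above can be illustrated with a minimal 1-D NumPy sketch. It is not the authors' implementation: the paper works on 2-D images inside full CNNs, whereas this toy example assumes a single hand-built Gabor filter, a pure cosine input at the filter's center frequency, circular convolution, and a pooling size and stride of 4. The function names (`gabor_kernel`, `rmax`, `cmod`, `shift_sensitivity`) and all parameter values are hypothetical choices made for illustration only.

```python
import numpy as np

def gabor_kernel(size=41, freq=3 / 16, sigma=4.0):
    """Complex 1-D Gabor filter: Gaussian envelope times a complex exponential."""
    t = np.arange(size) - size // 2
    return np.exp(-t**2 / (2 * sigma**2)) * np.exp(2j * np.pi * freq * t)

def circular_conv(x, w):
    """Circular convolution via the FFT, so that np.roll acts as an exact shift."""
    return np.fft.ifft(np.fft.fft(x) * np.fft.fft(w, n=len(x)))

def rmax(x, w, pool=4):
    """RMax (simplified): real-valued convolution, then non-overlapping max pooling."""
    y = circular_conv(x, w.real).real
    return y.reshape(-1, pool).max(axis=1)

def cmod(x, w, stride=4):
    """CMod (simplified): strided complex-valued convolution, then modulus."""
    return np.abs(circular_conv(x, w)[::stride])

def shift_sensitivity(op, x, w, shift=1):
    """Relative change of op's output when the input is shifted by `shift` samples."""
    ref = op(x, w)
    shifted = op(np.roll(x, shift), w)
    return np.linalg.norm(shifted - ref) / np.linalg.norm(ref)

# Assumed test signal: a cosine at the filter's center frequency (3/16 cycles/sample).
n = np.arange(64)
x = np.cos(2 * np.pi * (3 / 16) * n)
w = gabor_kernel()

rmax_change = shift_sensitivity(rmax, x, w)
cmod_change = shift_sensitivity(cmod, x, w)
print(f"RMax relative change: {rmax_change:.3f}")
print(f"CMod relative change: {cmod_change:.2e}")
```

In this setting, a one-sample shift changes the RMax output noticeably, because the pooled maximum depends on where the oscillating real response falls inside each pooling window. The CMod output barely changes, because the modulus of the complex response follows the smooth envelope rather than the oscillation. The sketch only demonstrates the stability argument on one input; it says nothing about classification accuracy.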

Code Implementations: 1 repo