Accurate Shift Invariant Convolutional Neural Networks Using Gaussian-Hermite Moments
This work addresses shift sensitivity in CNNs for computer vision, offering an improvement that requires no architectural changes.
The paper tackles the lack of shift invariance in CNNs by proposing Gaussian-Hermite Sampling (GHS) as a downsampling strategy, achieving 100% classification consistency under spatial shifts and improved accuracy on datasets such as CIFAR-10 and MNIST-rot.
Convolutional neural networks (CNNs) are not inherently shift invariant or equivariant. The downsampling operation used in CNNs is one of the main reasons they lose shift invariance. At the same time, downsampling is important for computational efficiency and for enlarging the receptive field to capture more contextual information. In this work, we propose Gaussian-Hermite Sampling (GHS), a novel downsampling strategy designed to achieve accurate shift invariance. GHS leverages Gaussian-Hermite polynomials to perform shift-consistent sampling, so that CNN layers are invariant to arbitrary spatial shifts even before training. When integrated into standard CNN architectures, the proposed method embeds shift invariance directly at the layer level without requiring architectural modifications or additional training procedures. We evaluate the proposed approach on the CIFAR-10, CIFAR-100, and MNIST-rot datasets. Experimental results demonstrate that GHS significantly improves shift consistency, achieving 100% classification consistency under spatial shifts, while also improving classification accuracy compared to baseline CNN models.
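The claim that downsampling breaks shift invariance can be illustrated with a toy example. The sketch below is not GHS and is not taken from the paper: it uses a 1D signal, plain stride-2 subsampling, and global max pooling as a stand-in for a network. The helper names (`circular_shift`, `strided_downsample`, `pooled_feature`, `shift_consistency`) are hypothetical, and the consistency measure used here (the fraction of circular shifts whose output matches the unshifted output) is an assumed simplification of the classification consistency reported in the abstract.

```python
def circular_shift(signal, k):
    """Shift a 1D list right by k positions with wrap-around."""
    k %= len(signal)
    return signal[-k:] + signal[:-k] if k else list(signal)


def strided_downsample(signal, stride=2):
    """Keep every `stride`-th sample, starting at index 0."""
    return signal[::stride]


def pooled_feature(signal, downsample):
    """Toy 'network': optional stride-2 downsampling, then global max pooling."""
    return max(strided_downsample(signal) if downsample else signal)


def shift_consistency(signal, downsample):
    """Fraction of circular shifts whose output matches the unshifted output.

    This definition is an assumption made for illustration only.
    """
    reference = pooled_feature(signal, downsample)
    n = len(signal)
    matches = sum(
        pooled_feature(circular_shift(signal, k), downsample) == reference
        for k in range(n)
    )
    return matches / n


signal = [0, 9, 0, 1, 0, 2, 0, 3]
print(shift_consistency(signal, downsample=False))  # 1.0
print(shift_consistency(signal, downsample=True))   # 0.5
```

Without downsampling, global max pooling sees the same set of values under every shift, so the output never changes. With stride-2 subsampling, an odd shift moves the large values onto the sampling grid and an even shift moves them off it, so the output depends on the shift. A shift-consistent sampling scheme is meant to remove exactly this dependence.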