CV LG Mar 1, 2023

Empowering Networks With Scale and Rotation Equivariance Using A Similarity Convolution

arXiv:2303.00326v15.910 citations h-index: 52

Originality: Incremental advance

AI Analysis

This work addresses a fundamental limitation in CNNs for computer vision, potentially enhancing generalization in applications like image recognition, though it appears incremental as it builds on existing equivariance concepts.

The paper tackles the lack of rotation and scaling equivariance in CNNs, which limits their generalization. It proposes a similarity convolution that is simultaneously equivariant to translation, rotation, and scaling, with efficiency similar to that of a traditional CNN and hardly any additional learnable parameters. Image classification experiments demonstrate robustness and improved generalization on scaled and rotated inputs.

The translation-equivariant nature of Convolutional Neural Networks (CNNs) is a reason for their great success in computer vision. However, these networks do not enjoy more general equivariance properties such as rotation or scaling, which ultimately limits their generalization performance. To address this limitation, we devise a method that endows CNNs with simultaneous equivariance with respect to translation, rotation, and scaling. Our approach defines a convolution-like operation and ensures equivariance based on our proposed scalable Fourier-Argand representation. The method maintains efficiency similar to that of a traditional network and introduces hardly any additional learnable parameters, since it does not face the computational issue that often occurs in group-convolution operators. We validate the efficacy of our approach on the image classification task, demonstrating its robustness and its ability to generalize to both scaled and rotated inputs.
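The gap the abstract describes can be checked numerically. The sketch below is not the paper's similarity convolution; it only illustrates, under simplifying assumptions (a single-channel image, periodic boundaries, a random 3x3 kernel), that an ordinary convolution commutes with translation but not with rotation. The helper `circular_conv2d` and the variable names are hypothetical and introduced here purely for illustration.

```python
import numpy as np

def circular_conv2d(image, kernel):
    """Plain 2D cross-correlation with periodic (wrap-around) boundaries."""
    out = np.zeros_like(image, dtype=float)
    kh, kw = kernel.shape
    for i in range(kh):
        for j in range(kw):
            # Rolling by (-i, -j) moves pixel (y + i, x + j) to position (y, x).
            out += kernel[i, j] * np.roll(image, shift=(-i, -j), axis=(0, 1))
    return out

rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))
kernel = rng.standard_normal((3, 3))  # generic kernel with no rotational symmetry

def shift(x):
    """Translate by 2 rows and 3 columns (assumed, arbitrary offsets)."""
    return np.roll(x, shift=(2, 3), axis=(0, 1))

def rotate(x):
    """Rotate by 90 degrees counter-clockwise."""
    return np.rot90(x)

# Equivariance means: transforming the input, then convolving,
# equals convolving, then transforming the output.
translation_equivariant = np.allclose(
    circular_conv2d(shift(image), kernel), shift(circular_conv2d(image, kernel))
)
rotation_equivariant = np.allclose(
    circular_conv2d(rotate(image), kernel), rotate(circular_conv2d(image, kernel))
)

print("translation equivariant:", translation_equivariant)
print("rotation equivariant:", rotation_equivariant)
```

With a generic kernel the translation check should hold and the rotation check should fail, which is the limitation the proposed method is designed to remove. Scaling is omitted from the sketch because resampling introduces interpolation error that would obscure the comparison.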
