CVJun 12, 2023

Scale-Rotation-Equivariant Lie Group Convolution Neural Networks (Lie Group-CNNs)

arXiv:2306.06934v13 citationsh-index: 21
Originality Highly original
AI Analysis

This addresses the problem of improving image classification robustness to geometric transformations for computer vision applications, representing a novel method rather than an incremental advance.

The study tackled the lack of scale-rotation-equivariance in CNNs by proposing a Lie group-CNN, achieving state-of-the-art recognition accuracies of 97.50% on a blood cell dataset and 77.90% on the HAM10000 dataset.

The weight-sharing mechanism of convolutional kernels ensures translation-equivariance of convolution neural networks (CNNs). Recently, rotation-equivariance has been investigated. However, research on scale-equivariance or simultaneous scale-rotation-equivariance is insufficient. This study proposes a Lie group-CNN, which can keep scale-rotation-equivariance for image classification tasks. The Lie group-CNN includes a lifting module, a series of group convolution modules, a global pooling layer, and a classification layer. The lifting module transfers the input image from Euclidean space to Lie group space, and the group convolution is parameterized through a fully connected network using Lie-algebra of Lie-group elements as inputs to achieve scale-rotation-equivariance. The Lie group SIM(2) is utilized to establish the Lie group-CNN with scale-rotation-equivariance. Scale-rotation-equivariance of Lie group-CNN is verified and achieves the best recognition accuracy on the blood cell dataset (97.50%) and the HAM10000 dataset (77.90%) superior to Lie algebra convolution network, dilation convolution, spatial transformer network, and scale-equivariant steerable network. In addition, the generalization ability of the Lie group-CNN on SIM(2) on rotation-equivariance is verified on rotated-MNIST and rotated-CIFAR10, and the robustness of the network is verified on SO(2) and SE(2). Therefore, the Lie group-CNN can successfully extract geometric features and performs equivariant recognition on images with rotation and scale transformations.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes