CyCNN: A Rotation Invariant CNN using Polar Mapping and Cylindrical Convolution Layers
This work addresses rotation invariance in image classification for computer vision applications. It offers a novel method for a known limitation of CNNs rather than a foundational breakthrough.
The paper tackles the lack of rotation invariance in CNN-based image classification by proposing CyCNN, which uses polar mapping to convert rotation to translation and cylindrical convolutional layers to handle the cylindrical property of polar coordinates. On the rotated MNIST, CIFAR-10, and SVHN datasets, when trained without data augmentation, CyCNN significantly improves classification accuracies compared to conventional CNNs.
Deep Convolutional Neural Networks (CNNs) are empirically known to be invariant to moderate translation but not to rotation in image classification. This paper proposes a deep CNN model, called CyCNN, which exploits polar mapping of input images to convert rotation to translation. To deal with the cylindrical property of the polar coordinates, we replace convolution layers in conventional CNNs with cylindrical convolutional (CyConv) layers. A CyConv layer exploits the cylindrically sliding windows (CSW) mechanism that vertically extends the input-image receptive fields of boundary units in a convolutional layer. We evaluate CyCNN and conventional CNN models for classification tasks on rotated MNIST, CIFAR-10, and SVHN datasets. We show that if there is no data augmentation during training, CyCNN significantly improves classification accuracies when compared to conventional CNN models. Our implementation of CyCNN is publicly available at https://github.com/mcrl/CyCNN.
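To illustrate why a cylindrical sliding window matters, the following is a minimal sketch and not the authors' implementation. It assumes an image that is already in polar form, with rows indexing the angle and columns indexing the radius, so that rotating the original image corresponds to a cyclic shift of the rows. The function names (`pad_cylindrical`, `cyconv`, `zeroconv`, `roll_rows`) are hypothetical and chosen for this example only. Padding wraps around along the angle axis and uses zeros along the radius axis.

```python
def pad_cylindrical(img, k):
    """Wrap k rows around the angle axis (rows); zero-pad k columns on the radius axis."""
    n = len(img)
    wrapped = [img[(i - k) % n] for i in range(n + 2 * k)]
    return [[0] * k + list(row) + [0] * k for row in wrapped]


def pad_zero(img, k):
    """Conventional zero padding on all four sides, for comparison."""
    w = len(img[0])
    blank = [[0] * w for _ in range(k)]
    rows = blank + [list(r) for r in img] + blank
    return [[0] * k + row + [0] * k for row in rows]


def conv2d_valid(img, kernel):
    """Plain 'valid' cross-correlation, as used in CNN convolution layers."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(img) - kh + 1, len(img[0]) - kw + 1
    return [
        [
            sum(img[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(ow)
        ]
        for i in range(oh)
    ]


def cyconv(img, kernel):
    """Convolution with cylindrical padding (assumed square, odd-sized kernel)."""
    return conv2d_valid(pad_cylindrical(img, len(kernel) // 2), kernel)


def zeroconv(img, kernel):
    """Convolution with ordinary zero padding (assumed square, odd-sized kernel)."""
    return conv2d_valid(pad_zero(img, len(kernel) // 2), kernel)


def roll_rows(img, s):
    """Cyclically shift rows by s; stands in for rotating the original image."""
    n = len(img)
    return [list(img[(i - s) % n]) for i in range(n)]


# Toy polar image (6 angle steps x 4 radius steps) and an arbitrary 3x3 kernel.
polar = [[i * 4 + j + 1 for j in range(4)] for i in range(6)]
kernel = [[1, 2, 1], [0, 1, 0], [-1, 0, 2]]

# With cylindrical padding, shifting the input shifts the output identically.
print(cyconv(roll_rows(polar, 2), kernel) == roll_rows(cyconv(polar, kernel), 2))
# With zero padding, the boundary rows break this property.
print(zeroconv(roll_rows(polar, 2), kernel) == roll_rows(zeroconv(polar, kernel), 2))
```

In this toy setting the wrap-around padding makes the layer commute with cyclic shifts along the angle axis, which is the translation that a rotation becomes after polar mapping. Zero padding loses that property at the top and bottom rows.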