CV · Sep 6, 2017

Polar Transformer Networks

arXiv:1709.01889v3 · 199 citations
Originality: Incremental advance
AI Analysis

This paper addresses the need for image recognition models that are more robust to geometric transformations such as rotation and scaling, though it builds incrementally on existing spatial transformer and canonical coordinate representation ideas.

The paper tackles the problem of limited equivariance in convolutional neural networks by introducing Polar Transformer Networks (PTN), which achieve invariance to translation and equivariance to rotation and scale, resulting in state-of-the-art performance on rotated MNIST and a new SIM2MNIST dataset.

Convolutional neural networks (CNNs) are inherently equivariant to translation. Efforts to embed other forms of equivariance have concentrated solely on rotation. We expand the notion of equivariance in CNNs through the Polar Transformer Network (PTN). PTN combines ideas from the Spatial Transformer Network (STN) and canonical coordinate representations. The result is a network invariant to translation and equivariant to both rotation and scale. PTN is trained end-to-end and composed of three distinct stages: a polar origin predictor, the newly introduced polar transformer module and a classifier. PTN achieves state-of-the-art on rotated MNIST and the newly introduced SIM2MNIST dataset, an MNIST variation obtained by adding clutter and perturbing digits with translation, rotation and scaling. The ideas of PTN are extensible to 3D which we demonstrate through the Cylindrical Transformer Network.
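To make the role of the polar transformer module more concrete, the following is a minimal NumPy sketch of log-polar resampling around a given origin. It is not the authors' implementation: the function name `log_polar`, its parameters, the choice of maximum radius, and the bilinear interpolation with zero padding are all assumptions made for illustration, and the origin is simply passed in rather than predicted by a network. The point it illustrates is that, in log-polar coordinates centred on the origin, a rotation of the input becomes a circular shift along the angle axis and a scaling becomes a shift along the log-radius axis, which an ordinary convolutional classifier can then handle.

```python
import numpy as np


def log_polar(image, origin, out_shape=(16, 32)):
    """Resample a 2D image onto a log-polar grid centred at `origin`.

    Hypothetical helper for illustration only.
    image:     2D array of shape (H, W).
    origin:    (row, col) centre of the polar grid; in a PTN-like model this
               would come from the origin predictor (assumed here as given).
    out_shape: (number of radii, number of angles) of the output grid.
    """
    h, w = image.shape
    n_r, n_theta = out_shape

    # Assumption: the largest sampled radius is half the shorter image side.
    max_r = 0.5 * min(h, w)
    # Radii are spaced uniformly in log(r), from 1 pixel up to max_r.
    radii = np.exp(np.linspace(0.0, np.log(max_r), n_r))
    thetas = 2.0 * np.pi * np.arange(n_theta) / n_theta

    # Cartesian sampling locations for every (radius, angle) pair.
    rows = origin[0] + radii[:, None] * np.sin(thetas)[None, :]
    cols = origin[1] + radii[:, None] * np.cos(thetas)[None, :]

    # Bilinear interpolation with zero padding outside the image.
    r0 = np.floor(rows).astype(int)
    c0 = np.floor(cols).astype(int)
    fr = rows - r0
    fc = cols - c0

    def at(r, c):
        valid = (r >= 0) & (r < h) & (c >= 0) & (c < w)
        vals = image[np.clip(r, 0, h - 1), np.clip(c, 0, w - 1)]
        return np.where(valid, vals, 0.0)

    return (
        (1 - fr) * (1 - fc) * at(r0, c0)
        + (1 - fr) * fc * at(r0, c0 + 1)
        + fr * (1 - fc) * at(r0 + 1, c0)
        + fr * fc * at(r0 + 1, c0 + 1)
    )


# Usage: a 90-degree rotation about the origin shows up as a circular shift
# of a quarter of the angle bins in the log-polar output.
rng = np.random.default_rng(0)
img = rng.random((33, 33))
centre = (16.0, 16.0)
lp = log_polar(img, centre)
lp_rot = log_polar(np.rot90(img), centre)
shift_matches = np.allclose(lp_rot, np.roll(lp, -lp.shape[1] // 4, axis=1))
```

A 90-degree rotation is used in the usage example because it maps the pixel grid onto itself, so the shift relationship can be checked without interpolation error; for arbitrary angles the relationship holds only approximately on a discrete grid.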

Code Implementations: 1 repo
