CV · Jan 29, 2018

Hierarchical Spatial Transformer Network

arXiv:1801.09467v2 · 2 citations
AI Analysis

This paper addresses a key limitation of spatial transformer networks in computer vision, though the contribution is incremental because it builds on existing methods.

The paper tackles local spatial variance in image deformation by combining approximation theory and optical flow theory in a hierarchical convolutional neural network, and reports significant improvements on cluttered MNIST classification and image plane alignment tasks.

Computer vision researchers have long wanted neural networks to have a spatial transformation ability that removes the interference caused by geometric distortion. The spatial transformer network made this possible. The spatial transformer network and its variants handle global displacement well, but they cannot deal with local spatial variance, so finding a better way to model deformation inside a neural network has become a pressing problem. To address this issue, we analyze the advantages and disadvantages of approximation theory and optical flow theory, then combine them to propose a novel way to achieve image deformation, implemented with a hierarchical convolutional neural network. The approach solves for a linear deformation together with an optical flow field to model image deformation. In experiments on cluttered MNIST handwritten digit classification and image plane alignment, our method outperforms baseline methods by a large margin.
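
The idea of modelling a deformation as a linear (affine) part plus an optical flow field can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (affine_grid, bilinear_sample, warp), the normalized [-1, 1] coordinate convention, and the choice to add the flow as a per-pixel offset to the affine sampling grid are all assumptions made for illustration. In the paper both parts are predicted by a hierarchical CNN; here they are simply passed in as arrays.

```python
import numpy as np

def affine_grid(theta, height, width):
    """Map each output pixel's normalized (x, y) in [-1, 1] through a 2x3 affine matrix."""
    ys, xs = np.meshgrid(np.linspace(-1.0, 1.0, height),
                         np.linspace(-1.0, 1.0, width), indexing="ij")
    coords = np.stack([xs, ys, np.ones_like(xs)], axis=-1)  # (H, W, 3)
    return coords @ theta.T                                  # (H, W, 2), last axis is (x, y)

def bilinear_sample(image, grid):
    """Sample a 2-D image at normalized (x, y) locations with bilinear interpolation."""
    height, width = image.shape
    # Convert normalized coordinates to pixel coordinates and clamp to the image.
    x = np.clip((grid[..., 0] + 1.0) * 0.5 * (width - 1), 0, width - 1)
    y = np.clip((grid[..., 1] + 1.0) * 0.5 * (height - 1), 0, height - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, width - 1)
    y1 = np.minimum(y0 + 1, height - 1)
    wx = x - x0
    wy = y - y0
    top = image[y0, x0] * (1 - wx) + image[y0, x1] * wx
    bottom = image[y1, x0] * (1 - wx) + image[y1, x1] * wx
    return top * (1 - wy) + bottom * wy

def warp(image, theta, flow):
    """Hypothetical composition: global affine grid plus a local per-pixel flow offset."""
    grid = affine_grid(theta, *image.shape) + flow
    return bilinear_sample(image, grid)

# Usage: an identity affine part with zero flow leaves the image unchanged.
image = np.arange(25, dtype=float).reshape(5, 5)
identity = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0]])
zero_flow = np.zeros((5, 5, 2))
unchanged = warp(image, identity, zero_flow)

# A constant flow of +0.5 in normalized x equals a one-pixel shift on a 5-pixel-wide image.
shift_flow = np.zeros((5, 5, 2))
shift_flow[..., 0] = 0.5
shifted = warp(image, identity, shift_flow)
```

In this sketch the affine matrix captures global displacement, while the flow array can differ at every pixel, which is what allows local deformation that a single affine transform cannot express.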
