Hyperpixel Flow: Semantic Correspondence with Multi-layer Neural Features
The paper addresses the problem of establishing visual correspondences between images, a building block for many computer vision tasks, and reports incremental improvements in matching accuracy and efficiency.
The paper tackles semantic correspondence under large intra-class variations by representing images with "hyperpixels" that combine a small number of relevant features selected from multiple CNN layers. The resulting real-time matching method achieves state-of-the-art results on three standard benchmarks and on a new dataset, SPair-71k.
Establishing visual correspondences under large intra-class variations requires analyzing images at different levels, from features linked to semantics and context to local patterns, while being invariant to instance-specific details. To tackle these challenges, we represent images by "hyperpixels" that leverage a small number of relevant features selected among early to late layers of a convolutional neural network. Taking advantage of the condensed features of hyperpixels, we develop an effective real-time matching algorithm based on Hough geometric voting. The proposed method, hyperpixel flow, sets a new state of the art on three standard benchmarks as well as a new dataset, SPair-71k, which contains a significantly larger number of image pairs than existing datasets, with more accurate and richer annotations for in-depth analysis.
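The hyperpixel idea described above can be illustrated with a minimal sketch. It assumes feature maps are plain nested lists (channels x height x width), that the selected layers are resized to the largest selected resolution, and that nearest-neighbour resizing is good enough for illustration. The function names (upsample_nearest, build_hyperpixels, cosine) and the toy data are hypothetical, not the authors' implementation, and the layer selection and Hough voting steps are not shown.

```python
import math


def upsample_nearest(fmap, out_h, out_w):
    """Resize a feature map (C channels, each H x W) by nearest neighbour."""
    resized = []
    for channel in fmap:
        in_h, in_w = len(channel), len(channel[0])
        resized.append([
            [channel[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
            for y in range(out_h)
        ])
    return resized


def build_hyperpixels(feature_maps, selected_layers):
    """Concatenate the selected layers channel-wise at a common resolution.

    Returns a grid where grid[y][x] is the hyperpixel feature vector
    at spatial position (y, x).
    """
    chosen = [feature_maps[i] for i in selected_layers]
    # Assumption: use the largest spatial size among the selected layers.
    out_h = max(len(f[0]) for f in chosen)
    out_w = max(len(f[0][0]) for f in chosen)
    channels = []
    for fmap in chosen:
        channels.extend(upsample_nearest(fmap, out_h, out_w))
    return [
        [[c[y][x] for c in channels] for x in range(out_w)]
        for y in range(out_h)
    ]


def cosine(u, v):
    """Cosine similarity between two hyperpixel vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


# Toy stand-ins for early, middle, and late CNN layers (made-up values).
layer_early = [[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]]
layer_mid = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
layer_late = [[[0.5]]]
feature_maps = [layer_early, layer_mid, layer_late]

# Keep only a small subset of layers, here the early and middle ones.
hyperpixels = build_hyperpixels(feature_maps, [0, 1])
print(len(hyperpixels), len(hyperpixels[0]), len(hyperpixels[0][0]))
```

Each grid position ends up with one vector that mixes local detail from the early layer with coarser context from the deeper layer, and the matching stage can then compare those vectors across two images with a similarity such as cosine.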