Matching Disparate Image Pairs Using Shape-Aware ConvNets
This work addresses a challenging computer vision problem relevant to applications such as image registration and object recognition, but its contribution is incremental: it builds on existing graph matching and shape representation methods.
The paper tackles the problem of matching disparate image pairs with strong affine variations, occlusion, and illumination changes by proposing an end-to-end ConvNet architecture that combines local features and high-level shape cues. It achieves state-of-the-art results in both coarse shape matching and fine point-wise correspondence determination.
An end-to-end trainable ConvNet architecture that learns to harness the power of shape representation for matching disparate image pairs is proposed. Disparate image pairs are deemed those that exhibit strong affine variations in scale, viewpoint and projection parameters, accompanied by partial or complete occlusion of objects and extreme variations in ambient illumination. Under these challenging conditions, neither local nor global feature-based image matching methods, when used in isolation, have been observed to be effective. The proposed correspondence determination scheme for matching disparate images exploits high-level shape cues that are derived from low-level local feature descriptors, thus combining the best of both worlds. A graph-based representation for the disparate image pair is generated by constructing an affinity matrix that embeds the distances between feature points in the two images, thus modeling the correspondence determination problem as one of graph matching. The eigenspectrum of the affinity matrix, i.e., the learned global shape representation, is then used to regress the transformation or homography that defines the correspondence between the source image and the target image. The proposed scheme is shown to yield state-of-the-art results for both coarse-level shape matching and fine point-wise correspondence determination.
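The graph-based step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not specify the distance measure, the kernel, or how the matrix is assembled, so the Gaussian kernel, the `sigma` parameter, the stacking of both images' descriptors into one joint matrix, and the function names `affinity_matrix` and `eigenspectrum` are all assumptions made for illustration. Random vectors stand in for real local feature descriptors.

```python
import numpy as np


def affinity_matrix(desc_a, desc_b, sigma=1.0):
    """Joint affinity over the feature points of both images.

    Assumption: affinities are a Gaussian kernel on pairwise Euclidean
    distances between descriptors; the paper may use a different form.
    """
    feats = np.vstack([desc_a, desc_b])           # shape (n_a + n_b, d)
    diff = feats[:, None, :] - feats[None, :, :]  # all pairwise differences
    sq_dist = np.sum(diff ** 2, axis=-1)          # squared distances
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def eigenspectrum(affinity, k=None):
    """Eigenvalues of a symmetric affinity matrix, largest first."""
    vals = np.linalg.eigvalsh(affinity)[::-1]     # eigvalsh returns ascending order
    return vals if k is None else vals[:k]


# Hypothetical stand-in descriptors: 5 points in image A, 6 in image B, 8-D each.
rng = np.random.default_rng(0)
desc_a = rng.normal(size=(5, 8))
desc_b = rng.normal(size=(6, 8))

A = affinity_matrix(desc_a, desc_b)
spec = eigenspectrum(A, k=4)  # leading eigenvalues as a compact shape signature
```

In the paper, a spectrum of this kind serves as the global shape representation that the network uses to regress the transformation or homography. The regression itself is learned end to end and is not reproduced here.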