CV · Sep 19, 2017

Look Wider to Match Image Patches with Convolutional Neural Networks

arXiv:1709.06248v1 · 105 citations
Originality: Incremental advance
AI Analysis

This paper addresses stereo matching for computer vision applications. It offers a robust way to avoid matching artifacts in weakly textured regions and other difficult cases, though it is an incremental improvement over existing methods.

The paper tackles the problem of stereo matching by proposing a novel cost function that intelligently uses information from a large window without losing resolution, achieving near-peak performance on the Middlebury benchmark.

When a human matches two images, the viewer has a natural tendency to look at the wide area around the target pixel to obtain clues to the correct correspondence. However, designing a matching cost function that works on a large window in the same way is difficult. The cost function is typically not intelligent enough to discard information irrelevant to the target pixel, which results in undesirable artifacts. In this paper, we propose a novel convolutional neural network (CNN) module to learn a stereo matching cost with a large-sized window. Unlike conventional pooling layers with strides, the proposed per-pixel pyramid-pooling layer can cover a large area without a loss of resolution and detail. Therefore, the learned matching cost function can successfully utilize the information from a large area without introducing the fattening effect. The proposed method is robust to weak textures, depth discontinuities, and illumination and exposure differences. The proposed method achieves near-peak performance on the Middlebury benchmark.
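The contrast between strided pooling and per-pixel pyramid pooling can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the use of max pooling, the window sizes (1, 3, 5), and the border clipping are all assumptions chosen for illustration. The point it shows is that pooling with stride 1 at several window sizes, stacked as channels, gives every pixel a wide view while the output keeps the input resolution.

```python
def max_pool_stride1(feature, window):
    """Max-pool a 2D map with stride 1, so the output has the input's size.

    Windows are clipped at the image border (an assumption of this sketch).
    """
    height, width = len(feature), len(feature[0])
    before = (window - 1) // 2  # cells covered above/left of the centre
    after = window // 2         # cells covered below/right of the centre
    pooled = []
    for y in range(height):
        row = []
        for x in range(width):
            y0, y1 = max(0, y - before), min(height, y + after + 1)
            x0, x1 = max(0, x - before), min(width, x + after + 1)
            row.append(max(feature[j][i]
                           for j in range(y0, y1)
                           for i in range(x0, x1)))
        pooled.append(row)
    return pooled


def per_pixel_pyramid_pool(feature, windows=(1, 3, 5)):
    """Stack stride-1 poolings of several window sizes, one channel per size.

    The window sizes are hypothetical, not taken from the paper.
    """
    return [max_pool_stride1(feature, w) for w in windows]


# A 5x5 map with a single strong response in the centre.
feature = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 9, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
pyramid = per_pixel_pyramid_pool(feature)
```

Each channel of `pyramid` is still 5x5: the window-1 channel preserves the fine detail, while the window-5 channel lets even a corner pixel see the central response. A strided pooling layer with the same window would instead shrink the map and lose per-pixel detail.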

