CV · Aug 21, 2020

Single-Image Depth Prediction Makes Feature Matching Easier

arXiv:2008.09497v1 · 26 citations
AI Analysis

This work addresses the robustness of feature matching in computer vision applications. It offers a practical enhancement that improves matching without heavy prerequisites, though the contribution is incremental because it builds on existing depth prediction and feature extraction methods.

The paper tackles local feature matching for 3D re-localization and multi-view reconstruction. It uses CNN-based depth predictions from single RGB images to pre-warp the images and rectify perspective distortions. This significantly enhances SIFT and BRISK features and yields more good matches, even under challenging conditions such as cameras viewing the same scene from opposite directions.

Good local features improve the robustness of many 3D re-localization and multi-view reconstruction pipelines. The problem is that viewing angle and distance severely impact the recognizability of a local feature. Attempts to improve appearance invariance by choosing better local feature points or by leveraging outside information have come with prerequisites that made some of them impractical. In this paper, we propose a surprisingly effective enhancement to local feature extraction, which improves matching. We show that CNN-based depths inferred from single RGB images are quite helpful, despite their flaws. They allow us to pre-warp images and rectify perspective distortions, to significantly enhance SIFT and BRISK features, enabling more good matches, even when cameras are looking at the same scene but in opposite directions.
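
The pre-warping idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a pinhole camera with known intrinsics `K`, a per-pixel z-depth map (for example from a monocular depth network), and a region that is roughly planar. The function names (`backproject`, `dominant_normal`, `rotation_aligning`, `rectifying_homography`) are hypothetical. The sketch fits a plane to the back-projected points and builds the homography of a virtual camera rotation that makes the plane fronto-parallel.

```python
import numpy as np

def backproject(depth, K):
    """Lift every pixel (u, v) with z-depth d to the 3D point d * K^-1 [u, v, 1]^T."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(float)  # (h, w, 3)
    rays = pix @ np.linalg.inv(K).T  # rays with unit z-component
    return rays * depth[..., None]

def dominant_normal(points):
    """Least-squares plane fit via SVD; returns the unit normal facing the camera."""
    pts = points.reshape(-1, 3)
    centroid = pts.mean(axis=0)
    _, _, vt = np.linalg.svd(pts - centroid, full_matrices=False)
    n = vt[-1]  # direction of least variance
    # The camera sits at the origin, so a camera-facing normal has n . centroid < 0.
    return -n if n @ centroid > 0 else n

def rotation_aligning(a, b):
    """Rodrigues rotation that takes unit vector a onto unit vector b."""
    v = np.cross(a, b)
    s, c = np.linalg.norm(v), float(a @ b)
    if s < 1e-12:
        if c > 0:
            return np.eye(3)
        raise ValueError("vectors are antiparallel; rotation axis is ambiguous")
    vx = np.array([[0.0, -v[2], v[1]],
                   [v[2], 0.0, -v[0]],
                   [-v[1], v[0], 0.0]])
    return np.eye(3) + vx + vx @ vx * ((1.0 - c) / s**2)

def rectifying_homography(depth, K):
    """Homography H = K R K^-1 of a virtual rotation that makes the fitted plane fronto-parallel."""
    n = dominant_normal(backproject(depth, K))
    R = rotation_aligning(n, np.array([0.0, 0.0, -1.0]))
    return K @ R @ np.linalg.inv(K), R
```

In a full pipeline one would presumably warp the image with `H` (for instance with OpenCV's `cv2.warpPerspective`), extract SIFT or BRISK features on the rectified image, and map keypoint locations back through the inverse of `H`. A real scene contains several planes, so a practical version would need to segment the image by surface orientation first; this sketch handles a single plane only.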

