CV · May 9, 2019

D2-Net: A Trainable CNN for Joint Detection and Description of Local Features

arXiv:1905.03561v1 · 892 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of robust image matching and localization for computer vision applications. It represents a novel hybrid approach rather than a foundational breakthrough.

The paper tackles the problem of finding reliable pixel-level correspondences under difficult imaging conditions by proposing D2-Net, a single CNN that jointly performs dense feature description and detection, resulting in state-of-the-art performance on the Aachen Day-Night and InLoc localization benchmarks.

In this work we address the problem of finding reliable pixel-level correspondences under difficult imaging conditions. We propose an approach where a single convolutional neural network plays a dual role: It is simultaneously a dense feature descriptor and a feature detector. By postponing the detection to a later stage, the obtained keypoints are more stable than their traditional counterparts based on early detection of low-level structures. We show that this model can be trained using pixel correspondences extracted from readily available large-scale SfM reconstructions, without any further annotations. The proposed method obtains state-of-the-art performance on both the difficult Aachen Day-Night localization dataset and the InLoc indoor localization benchmark, as well as competitive performance on other benchmarks for image matching and 3D reconstruction.
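The "describe-and-detect" idea in the abstract can be illustrated with a minimal sketch. It assumes the CNN has already produced a dense feature map of C channels over an H x W grid. It applies the hard detection rule as described in the paper (not quoted in the abstract above): a pixel is a keypoint if its strongest channel is also a spatial local maximum within that channel. The function name `detect_and_describe`, the 3x3 neighbourhood, and the positive-response guard are illustrative assumptions, and the soft, differentiable variant used for training is not covered.

```python
import math

def detect_and_describe(feature_map):
    """Hard detection on a dense feature map (hypothetical helper).

    feature_map: list of C channels, each an H x W list of floats,
    standing in for the output of a CNN backbone.
    Returns (keypoints, descriptors): (row, col) tuples and the
    L2-normalised channel vector at each detected pixel.
    """
    n_channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    keypoints, descriptors = [], []
    for i in range(height):
        for j in range(width):
            # The dense descriptor at (i, j) is the vector across channels.
            column = [feature_map[c][i][j] for c in range(n_channels)]
            # Channel with the strongest response at this pixel.
            best = max(range(n_channels), key=lambda c: column[c])
            peak = column[best]
            # 3x3 neighbourhood in that channel, clipped at the borders
            # (neighbourhood size is an assumption of this sketch).
            neighbours = [
                feature_map[best][y][x]
                for y in range(max(0, i - 1), min(height, i + 2))
                for x in range(max(0, j - 1), min(width, j + 2))
                if (y, x) != (i, j)
            ]
            # Assumption: require a strictly positive, strict local maximum.
            if peak > 0 and all(peak > v for v in neighbours):
                norm = math.sqrt(sum(v * v for v in column))
                keypoints.append((i, j))
                descriptors.append([v / norm for v in column])
    return keypoints, descriptors

# Toy 2-channel, 3x3 feature map (made-up values, not real network output).
toy_map = [
    [[0.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 0.0]],
    [[3.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
]
keypoints, descriptors = detect_and_describe(toy_map)
```

Because the keypoints are read off the same feature map that supplies the descriptors, detection happens after the network has computed high-level features rather than on low-level image structure, which is the "postponed detection" the abstract refers to.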

Code Implementations: 4 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
