CV · Jan 8, 2022

Decoupling Makes Weakly Supervised Local Feature Better

arXiv:2201.02861v2 · 67 citations
AI Analysis

This work addresses the challenge of acquiring densely labeled datasets for local feature methods in computer vision, offering a novel weakly supervised approach that improves performance for tasks like image matching and 3D reconstruction.

The paper tackles the limited performance of weakly supervised local feature learning, which arises because weak supervision cannot tell detection errors from description errors when both are learned jointly. It proposes a decoupled describe-then-detect pipeline and a line-to-window search strategy that uses camera pose. The resulting method, PoSFeat, outperforms previous fully and weakly supervised methods and achieves state-of-the-art performance on a range of downstream tasks.

Weakly supervised learning can help local feature methods overcome the obstacle of acquiring a large-scale dataset with densely labeled correspondences. However, since weak supervision cannot distinguish the losses caused by the detection and description steps, directly conducting weakly supervised learning within a joint describe-then-detect pipeline suffers from limited performance. In this paper, we propose a decoupled describe-then-detect pipeline tailored for weakly supervised local feature learning. Within our pipeline, the detection step is decoupled from the description step and postponed until discriminative and robust descriptors are learned. In addition, we introduce a line-to-window search strategy to explicitly use the camera pose information for better descriptor learning. Extensive experiments show that our method, namely PoSFeat (Camera Pose Supervised Feature), outperforms previous fully and weakly supervised methods and achieves state-of-the-art performance on a wide range of downstream tasks.
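
To make concrete how camera pose can constrain where a match may lie, the following is a minimal sketch of the standard epipolar constraint, not the PoSFeat implementation. It assumes a pinhole camera with zero skew and a relative pose (R, t) mapping points from camera 1 to camera 2. All function names are hypothetical, and the line-to-window search itself is not reproduced here, since the abstract does not detail it.

```python
import math

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ v equals t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]]

def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def inv_intrinsics(K):
    """Invert K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]] (zero skew assumed)."""
    fx, fy, cx, cy = K[0][0], K[1][1], K[0][2], K[1][2]
    return [[1.0 / fx, 0.0, -cx / fx],
            [0.0, 1.0 / fy, -cy / fy],
            [0.0, 0.0, 1.0]]

def fundamental_matrix(K1, K2, R, t):
    """F = K2^{-T} [t]_x R K1^{-1}, assuming X2 = R X1 + t."""
    E = matmul(skew(t), R)  # essential matrix
    return matmul(matmul(transpose(inv_intrinsics(K2)), E), inv_intrinsics(K1))

def epipolar_line(F, x1):
    """Line (a, b, c) in image 2, a*u + b*v + c = 0, for pixel x1 = (u, v) in image 1."""
    u, v = x1
    return tuple(F[i][0] * u + F[i][1] * v + F[i][2] for i in range(3))

def point_line_distance(line, x2):
    """Pixel distance from x2 = (u, v) to the line (a, b, c)."""
    a, b, c = line
    u, v = x2
    return abs(a * u + b * v + c) / math.hypot(a, b)

if __name__ == "__main__":
    # Toy setup (assumed values): identical cameras, pure sideways translation.
    K = [[100.0, 0.0, 50.0], [0.0, 100.0, 50.0], [0.0, 0.0, 1.0]]
    R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    F = fundamental_matrix(K, K, R, (1.0, 0.0, 0.0))
    line = epipolar_line(F, (30.0, 40.0))
    # Candidates on the same image row lie on the line; others are penalised.
    print(point_line_distance(line, (70.0, 40.0)))
    print(point_line_distance(line, (70.0, 45.0)))
```

In this toy setup the epipolar line is the image row of the query pixel, so a candidate on that row has near-zero distance. A pose-supervised loss could use such a distance to restrict or weight candidate matches without any labeled correspondences.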

Code Implementations: 1 repo