CV | Oct 12, 2024

Leveraging Semantic Cues from Foundation Vision Models for Enhanced Local Feature Correspondence

arXiv:2410.09533v1 | 6 citations | h-index: 9 | ACCV
Originality: Incremental advance
AI Analysis

This work addresses the challenge of matching points from different semantic areas in visual correspondence tasks, such as camera localization and image registration, offering an incremental improvement over existing methods.

The paper tackles the problem of local feature matching in computer vision by incorporating semantic cues from foundation vision models into existing descriptors, achieving a 29% average performance increase in camera localization and comparable accuracy to state-of-the-art matchers like LightGlue and LoFTR.

Visual correspondence is a crucial step in key computer vision tasks, including camera localization, image registration, and structure from motion. The most effective techniques for matching keypoints currently rely on learned sparse or dense matchers, which take pairs of images as input. These neural networks have a good general understanding of features from both images, but they often struggle to match points from different semantic areas. This paper presents a new method that uses semantic cues from foundation vision model features (such as DINOv2) to enhance local feature matching by incorporating semantic reasoning into existing descriptors. Unlike learned matchers, the resulting descriptors do not require image pairs at inference time, which allows feature caching and fast matching using similarity search. We present adapted versions of six existing descriptors, with an average performance increase of 29% in camera localization and accuracy comparable to existing matchers such as LightGlue and LoFTR on two existing benchmarks. Both code and trained models are available at https://www.verlab.dcc.ufmg.br/descriptors/reasoning_accv24
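
To illustrate why per-image descriptors allow caching and matching by similarity search, here is a minimal sketch, not the authors' implementation. It assumes descriptors are plain lists of floats, that similarity is cosine similarity, and that matches are kept only when they are mutual nearest neighbours. All names (`l2_normalize`, `similarity_matrix`, `mutual_nearest_neighbors`, `cache`) and the toy values are hypothetical, and a real pipeline would likely use a vectorized library rather than pure Python.

```python
import math


def l2_normalize(vec):
    """Scale a descriptor to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def similarity_matrix(desc_a, desc_b):
    """Cosine similarity between every descriptor in desc_a and every one in desc_b."""
    a = [l2_normalize(d) for d in desc_a]
    b = [l2_normalize(d) for d in desc_b]
    return [[sum(x * y for x, y in zip(da, db)) for db in b] for da in a]


def mutual_nearest_neighbors(desc_a, desc_b):
    """Return (index_a, index_b) pairs that are each other's best match."""
    sim = similarity_matrix(desc_a, desc_b)
    if not sim or not sim[0]:
        return []
    # Best candidate in B for each descriptor in A.
    best_b_for_a = [max(range(len(row)), key=row.__getitem__) for row in sim]
    # Best candidate in A for each descriptor in B.
    best_a_for_b = [
        max(range(len(sim)), key=lambda i: sim[i][j]) for j in range(len(sim[0]))
    ]
    return [(i, j) for i, j in enumerate(best_b_for_a) if best_a_for_b[j] == i]


# Hypothetical cache: descriptors are computed once per image and reused
# for every later pairing, since no image pair is needed to produce them.
cache = {
    "image_a": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    "image_b": [[0.0, 0.9, 0.1], [0.9, 0.1, 0.0]],
}

matches = mutual_nearest_neighbors(cache["image_a"], cache["image_b"])
print(matches)  # [(0, 1), (1, 0)]
```

The third descriptor of `image_a` is dropped because its best candidate in `image_b` prefers a different point. A pairwise learned matcher would instead have to run a network forward pass for each image pair.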

Code Implementations: 1 repo