CV | Jul 2, 2024

Close, But Not There: Boosting Geographic Distance Sensitivity in Visual Place Recognition

arXiv:2407.02422v1 | 26 citations | h-index: 4 | Has Code
AI Analysis

This work addresses a specific bottleneck in VPR for localization and mapping systems, offering incremental improvements in benchmark performance.

The paper tackles the problem of geographic distance sensitivity in Visual Place Recognition (VPR) embeddings, which leads to incorrect sorting of top-k retrievals and reduces recall. By proposing a novel mining strategy called CliqueMining, it boosts sensitivity at small distances, raising recall@1 from 75% to 82% on MSLS Challenge and from 76% to 90% on Nordland.

Visual Place Recognition (VPR) plays a critical role in many localization and mapping pipelines. It consists of retrieving the closest sample to a query image, in a certain embedding space, from a database of geotagged references. The image embedding is learned to effectively describe a place despite variations in visual appearance, viewpoint, and geometric changes. In this work, we formulate how limitations in the Geographic Distance Sensitivity of current VPR embeddings result in a high probability of incorrectly sorting the top-k retrievals, negatively impacting the recall. In order to address this issue in single-stage VPR, we propose a novel mining strategy, CliqueMining, that selects positive and negative examples by sampling cliques from a graph of visually similar images. Our approach boosts the sensitivity of VPR embeddings at small distance ranges, significantly improving the state of the art on relevant benchmarks. In particular, we raise recall@1 from 75% to 82% in MSLS Challenge, and from 76% to 90% in Nordland. Models and code are available at https://github.com/serizba/cliquemining.
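The retrieval step and the recall@k metric described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy embeddings and positions, and the 25 m success threshold are all assumptions chosen for illustration. The second query shows the failure mode the abstract describes: a correct reference is in the top-k, but a farther reference is ranked first, so recall@1 drops while recall@2 does not.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve_top_k(query_emb, db_embs, k):
    """Indices of the k database embeddings closest to the query embedding."""
    order = sorted(range(len(db_embs)),
                   key=lambda i: euclidean(query_emb, db_embs[i]))
    return order[:k]

def recall_at_k(query_embs, query_pos, db_embs, db_pos, k, threshold_m=25.0):
    """Fraction of queries with at least one top-k retrieval whose
    geotag lies within threshold_m metres (assumed threshold)."""
    hits = 0
    for q_emb, q_pos in zip(query_embs, query_pos):
        top = retrieve_top_k(q_emb, db_embs, k)
        if any(euclidean(q_pos, db_pos[i]) <= threshold_m for i in top):
            hits += 1
    return hits / len(query_embs)

# Hypothetical toy data: planar positions in metres, 2-D embeddings.
db_pos = [(0.0, 0.0), (60.0, 0.0), (100.0, 0.0)]
db_embs = [[0.0, 0.0], [0.1, 0.0], [1.0, 0.0]]

query_pos = [(1.0, 0.0), (98.0, 0.0)]
# The second query's embedding is slightly closer to the reference at 60 m
# than to the correct one at 100 m, so the top-2 is sorted incorrectly.
query_embs = [[0.02, 0.0], [0.54, 0.0]]

r1 = recall_at_k(query_embs, query_pos, db_embs, db_pos, k=1)
r2 = recall_at_k(query_embs, query_pos, db_embs, db_pos, k=2)
print(r1, r2)
```

In this toy setup only the first query is answered correctly at k=1, while both are answered at k=2, which is the gap that sharper distance sensitivity in the embedding is meant to close.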

Code Implementations: 1 repo
