Are Local Features All You Need for Cross-Domain Visual Place Recognition?
This work addresses a critical problem for autonomous navigation and robotics, place recognition under domain shift. Its contribution appears incremental, however, since it pairs existing re-ranking techniques with new datasets and a benchmark rather than proposing a new method.
The paper tackles the challenge of cross-domain visual place recognition, where query images differ significantly from the database (e.g., night-time or occluded queries), by exploring re-ranking methods based on spatial verification with local descriptors. It introduces two new datasets and a benchmark of current state-of-the-art models to test whether these methods improve robustness. The abstract does not report specific numerical gains.
Visual Place Recognition is a task that aims to predict the coordinates of an image (called the query) based solely on visual clues. Most commonly, a retrieval approach is adopted, where the query is matched to the most similar images from a large database of geotagged photos, using learned global descriptors. Despite recent advances, recognizing the same place when the query comes from a significantly different distribution is still a major hurdle for state-of-the-art retrieval methods. Examples are heavy illumination changes (e.g., night-time images) or substantial occlusions (e.g., transient objects). In this work we explore whether re-ranking methods based on spatial verification can tackle these challenges, following the intuition that local descriptors are inherently more robust than global features to domain shifts. To this end, we provide a new, comprehensive benchmark on current state-of-the-art models. We also introduce two new demanding datasets with night and occluded queries, to be matched against a city-wide database. Code and datasets are available at https://github.com/gbarbarani/re-ranking-for-VPR.
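
The retrieve-then-re-rank pipeline described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' code and does not reflect the API of the linked repository: every function name here (retrieve, mutual_nn_matches, count_inliers, rerank) is hypothetical, mutual nearest-neighbour matching is an assumed matching strategy, and a simple affine RANSAC stands in for the homography or fundamental-matrix estimation that spatial verification usually relies on. Global and local descriptors are assumed to be precomputed arrays.

```python
import numpy as np

def retrieve(query_global, db_global, k):
    """Indices of the k database images closest to the query (cosine similarity)."""
    q = query_global / np.linalg.norm(query_global)
    d = db_global / np.linalg.norm(db_global, axis=1, keepdims=True)
    return np.argsort(-(d @ q))[:k]

def mutual_nn_matches(desc_a, desc_b):
    """Index pairs (i, j) where a[i] and b[j] are each other's nearest neighbour."""
    if len(desc_a) == 0 or len(desc_b) == 0:
        return np.array([], dtype=int), np.array([], dtype=int)
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    nn_ab = dists.argmin(axis=1)          # best match in b for each a
    nn_ba = dists.argmin(axis=0)          # best match in a for each b
    idx_a = np.arange(len(desc_a))
    keep = nn_ba[nn_ab] == idx_a          # keep only mutual agreements
    return idx_a[keep], nn_ab[keep]

def count_inliers(pts_a, pts_b, n_iters=200, tol=3.0, seed=0):
    """Largest inlier count found by RANSAC with a 2D affine model (assumption:
    a simplified stand-in for homography-based spatial verification)."""
    n = len(pts_a)
    if n < 3:
        return 0
    rng = np.random.default_rng(seed)
    a_h = np.hstack([pts_a, np.ones((n, 1))])   # homogeneous coords, shape (n, 3)
    best = 0
    for _ in range(n_iters):
        sample = rng.choice(n, size=3, replace=False)
        # Fit the 3x2 affine matrix M such that a_h[sample] @ M ~= pts_b[sample].
        m, *_ = np.linalg.lstsq(a_h[sample], pts_b[sample], rcond=None)
        residuals = np.linalg.norm(a_h @ m - pts_b, axis=1)
        best = max(best, int((residuals < tol).sum()))
    return best

def rerank(query_local, db_locals, candidates):
    """Re-order retrieved candidates by their number of geometric inliers."""
    q_kpts, q_desc = query_local
    scores = []
    for idx in candidates:
        kpts, desc = db_locals[idx]
        ia, ib = mutual_nn_matches(q_desc, desc)
        scores.append(count_inliers(q_kpts[ia], kpts[ib]))
    order = np.argsort(-np.asarray(scores), kind="stable")
    return [int(candidates[i]) for i in order], [scores[i] for i in order]
```

In this sketch, retrieve shortlists candidates using global descriptors only, and rerank re-orders that shortlist using local keypoints and descriptors, given as (keypoints, descriptors) pairs with shapes (n, 2) and (n, dim). The inlier count serves as the re-ranking score, so a candidate whose global descriptor is misleading under a domain shift can still move to the top if its local matches are geometrically consistent with the query.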