CVNov 6, 2023

Long-Term Invariant Local Features via Implicit Cross-Domain Correspondences

arXiv:2311.03345v1h-index: 191
AI Analysis

This work addresses the challenge of robust visual localization for long-term deployments in varying conditions, representing an incremental improvement over existing methods.

The paper tackles the problem of performance decline in learning-based visual feature extraction networks for localization across long-term visual domain variations, such as seasonal and daytime changes, by proposing a novel data-centric method called Implicit Cross-Domain Correspondences (iCDC) that significantly reduces the performance gap and outperforms existing methods on benchmarks.

Modern learning-based visual feature extraction networks perform well in intra-domain localization, however, their performance significantly declines when image pairs are captured across long-term visual domain variations, such as different seasonal and daytime variations. In this paper, our first contribution is a benchmark to investigate the performance impact of long-term variations on visual localization. We conduct a thorough analysis of the performance of current state-of-the-art feature extraction networks under various domain changes and find a significant performance gap between intra- and cross-domain localization. We investigate different methods to close this gap by improving the supervision of modern feature extractor networks. We propose a novel data-centric method, Implicit Cross-Domain Correspondences (iCDC). iCDC represents the same environment with multiple Neural Radiance Fields, each fitting the scene under individual visual domains. It utilizes the underlying 3D representations to generate accurate correspondences across different long-term visual conditions. Our proposed method enhances cross-domain localization performance, significantly reducing the performance gap. When evaluated on popular long-term localization benchmarks, our trained networks consistently outperform existing methods. This work serves as a substantial stride toward more robust visual localization pipelines for long-term deployments, and opens up research avenues in the development of long-term invariant descriptors.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes