Register assisted aggregation for Visual Place Recognition
This work addresses robust place recognition for applications such as robotics and autonomous systems, though it appears incremental because it builds on existing feature aggregation techniques.
The paper tackles Visual Place Recognition (VPR) under appearance changes by proposing a feature aggregation method that uses registers to preserve useful features such as buildings and trees; the authors report that it outperforms state-of-the-art methods.
Visual Place Recognition (VPR) is the task of using computer vision to determine the location at which the current query image was taken. Significant appearance changes between query images and the database images used for retrieval, caused by season, lighting, and elapsed time, make place recognition difficult. Previous methods often discard uninformative features (such as sky, road, and vehicles) but, in doing so, also indiscriminately discard features that help improve recognition accuracy (such as buildings and trees). To preserve these useful features, we propose a new feature aggregation method. Specifically, to obtain global and local features that contain discriminative place information, we add registers in addition to the original image tokens to assist model training. After the attention weights have been reallocated, these registers are discarded. Experimental results show that these registers surprisingly separate unstable features from the original image representation, and that the method outperforms state-of-the-art methods.
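The register mechanism described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the backbone, the number of registers, or the aggregation head, so the single-head attention with identity projections, the mean pooling, the random stand-ins for learned registers, and every name below (`self_attention`, `aggregate_with_registers`, `register_mass`) are assumptions made purely for illustration. It shows the three steps the text names: extra tokens are appended to the image tokens, attention is computed over the joint sequence, and the register outputs are dropped before aggregation.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head self-attention with identity Q/K/V projections.

    A real model would use learned projections; identity is an assumption
    that keeps the sketch short.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    outputs, weights = [], []
    for query in tokens:
        scores = [scale * sum(a * b for a, b in zip(query, key)) for key in tokens]
        row = softmax(scores)
        weights.append(row)
        outputs.append(
            [sum(row[j] * tokens[j][d] for j in range(len(tokens))) for d in range(dim)]
        )
    return outputs, weights


def aggregate_with_registers(patch_tokens, registers):
    """Append registers, attend jointly, then discard the register outputs."""
    num_patches = len(patch_tokens)
    tokens = patch_tokens + registers            # registers join the sequence
    outputs, weights = self_attention(tokens)

    local_features = outputs[:num_patches]       # registers are discarded here

    # Hypothetical aggregation head: mean pooling followed by L2 normalisation.
    dim = len(local_features[0])
    pooled = [sum(f[d] for f in local_features) / num_patches for d in range(dim)]
    norm = math.sqrt(sum(x * x for x in pooled)) or 1.0
    global_descriptor = [x / norm for x in pooled]

    # Share of each patch token's attention that went to the registers.
    register_mass = [sum(row[num_patches:]) for row in weights[:num_patches]]
    return local_features, global_descriptor, register_mass


rng = random.Random(0)
patch_tokens = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(6)]
# Random values stand in for registers that would be learned during training.
registers = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(2)]

local_features, global_descriptor, register_mass = aggregate_with_registers(
    patch_tokens, registers
)
```

In this sketch `register_mass` reports how much attention each image token spends on the registers, which is one simple way to inspect whether the registers are absorbing part of the attention before they are dropped.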