AmsterTime: A Visual Place Recognition Benchmark Dataset for Severe Domain Shift
This provides a benchmark for researchers in computer vision to test VPR methods under challenging domain shifts, though it is incremental as it focuses on dataset creation rather than novel algorithms.
The authors tackled the problem of visual place recognition under severe domain shift by introducing the AmsterTime dataset, which includes 2,500 image pairs from Amsterdam with variations in cameras and viewpoints, and found that a ResNet-101 model pre-trained on the Landmarks dataset achieved 84% accuracy in verification and 24% in retrieval tasks.
We introduce AmsterTime: a challenging dataset to benchmark visual place recognition (VPR) in presence of a severe domain shift. AmsterTime offers a collection of 2,500 well-curated images matching the same scene from a street view matched to historical archival image data from Amsterdam city. The image pairs capture the same place with different cameras, viewpoints, and appearances. Unlike existing benchmark datasets, AmsterTime is directly crowdsourced in a GIS navigation platform (Mapillary). We evaluate various baselines, including non-learning, supervised and self-supervised methods, pre-trained on different relevant datasets, for both verification and retrieval tasks. Our result credits the best accuracy to the ResNet-101 model pre-trained on the Landmarks dataset for both verification and retrieval tasks by 84% and 24%, respectively. Additionally, a subset of Amsterdam landmarks is collected for feature evaluation in a classification task. Classification labels are further used to extract the visual explanations using Grad-CAM for inspection of the learned similar visuals in a deep metric learning models.