CV · AI · Apr 29, 2024

OpenStreetView-5M: The Many Roads to Global Visual Geolocation

arXiv:2404.18873v1 · 49 citations · h-index: 6 · Has Code · CVPR
Originality: Incremental advance
AI Analysis

OpenStreetView-5M gives computer vision researchers a foundational resource for evaluating geolocation algorithms, addressing a key bottleneck in the field: the absence of standard, large-scale, open-access datasets.

The authors address the lack of standard, large-scale datasets for visual geolocation by introducing OpenStreetView-5M, a dataset of over 5.1 million geo-referenced street view images covering 225 countries and territories. Its strict train/test separation lets them benchmark state-of-the-art methods on learned geographical features rather than on memorization.

Determining the location of an image anywhere on Earth is a complex visual task, which makes it particularly relevant for evaluating computer vision algorithms. Yet, the absence of standard, large-scale, open-access datasets with reliably localizable images has limited its potential. To address this issue, we introduce OpenStreetView-5M, a large-scale, open-access dataset comprising over 5.1 million geo-referenced street view images, covering 225 countries and territories. In contrast to existing benchmarks, we enforce a strict train/test separation, allowing us to evaluate the relevance of learned geographical features beyond mere memorization. To demonstrate the utility of our dataset, we conduct an extensive benchmark of various state-of-the-art image encoders, spatial representations, and training strategies. All associated codes and models can be found at https://github.com/gastruc/osv5m.

Code Implementations: 1 repo