CV · CL · Nov 28, 2022

G^3: Geolocation via Guidebook Grounding

CMU
arXiv:2211.15521v1 · 12 citations · h-index: 156 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses geolocation for applications like interactive games and location-based services, but it is incremental as it builds on existing methods by incorporating textual guidebooks.

The paper tackles geolocation by predicting the country where an image was taken, using explicit knowledge from human-written guidebooks to improve accuracy. It reports an improvement of over 5% in Top-1 accuracy over a state-of-the-art image-only method.

We demonstrate how language can improve geolocation: the task of predicting the location where an image was taken. Here we study explicit knowledge from human-written guidebooks that describe the salient and class-discriminative visual features humans use for geolocation. We propose the task of Geolocation via Guidebook Grounding that uses a dataset of StreetView images from a diverse set of locations and an associated textual guidebook for GeoGuessr, a popular interactive geolocation game. Our approach predicts a country for each image by attending over the clues automatically extracted from the guidebook. Supervising attention with country-level pseudo labels achieves the best performance. Our approach substantially outperforms a state-of-the-art image-only geolocation method, with an improvement of over 5% in Top-1 accuracy. Our dataset and code can be found at https://github.com/g-luo/geolocation_via_guidebook_grounding.
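The abstract describes two mechanisms: attention over guidebook clues to predict a country, and supervision of that attention with country-level pseudo labels. The following is a minimal sketch of those two ideas, not the authors' implementation. It assumes dot-product attention over precomputed embeddings, a pseudo-label rule that marks a clue as relevant when its text mentions the image's country, and a linear country classifier. All function names, vectors, and weights here are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend_over_clues(image_vec, clue_vecs):
    """Return attention weights over clues and the weighted clue vector.

    Assumption: plain dot-product attention; the paper's scoring
    function may differ.
    """
    weights = softmax([dot(image_vec, c) for c in clue_vecs])
    dim = len(clue_vecs[0])
    context = [sum(w * c[i] for w, c in zip(weights, clue_vecs)) for i in range(dim)]
    return weights, context


def pseudo_label_targets(clue_countries, true_country):
    """Uniform target over clues that mention the true country (assumed rule)."""
    hits = [1.0 if true_country in countries else 0.0 for countries in clue_countries]
    n = sum(hits)
    if n == 0:
        return None  # no clue mentions this country, so skip attention supervision
    return [h / n for h in hits]


def attention_loss(weights, targets, eps=1e-12):
    """Cross-entropy between pseudo-label targets and attention weights."""
    return -sum(t * math.log(w + eps) for t, w in zip(targets, weights))


def predict_country(image_vec, context, country_weights):
    """Hypothetical linear classifier over [image features; attended clue features]."""
    features = image_vec + context  # list concatenation
    return max(country_weights, key=lambda c: dot(country_weights[c], features))


# Toy data: a 2-d image embedding and three clue embeddings (all made up).
image_vec = [1.0, 0.0]
clue_vecs = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
clue_countries = [{"Japan"}, {"Brazil"}, {"Japan", "Kenya"}]
country_weights = {
    "Japan": [1.0, 0.0, 1.0, 0.0],
    "Brazil": [0.0, 1.0, 0.0, 1.0],
}

weights, context = attend_over_clues(image_vec, clue_vecs)
targets = pseudo_label_targets(clue_countries, "Japan")
loss = attention_loss(weights, targets)
predicted = predict_country(image_vec, context, country_weights)
print(weights, loss, predicted)
```

In a trained model, a loss like `attention_loss` would be added to the country classification loss, so the model is pushed to attend to clues that mention the correct country. How the two terms are weighted is not stated in the abstract.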

Code Implementations: 1 repo
