CV, Apr 30, 2021

Interpretable Semantic Photo Geolocation

arXiv:2104.14995v2, 45 citations
AI Analysis

This paper addresses the need for interpretability in geolocalization systems, which matters to users who must validate predictions. The contribution is incremental, since it builds on existing CNN-based methods.

The paper aims to make photo geolocalization models more interpretable. It proposes a semantic partitioning method that improves understanding of predictions while achieving state-of-the-art accuracy on benchmark test sets, and it introduces a metric to assess the importance of visual concepts for a prediction.

Planet-scale photo geolocalization is the complex task of estimating the location depicted in an image solely based on its visual content. Due to the success of convolutional neural networks (CNNs), current approaches achieve super-human performance. However, previous work has exclusively focused on optimizing geolocalization accuracy. Due to the black-box property of deep learning systems, their predictions are difficult for humans to validate. State-of-the-art methods treat the task as a classification problem, where the choice of the classes, that is, the partitioning of the world map, is crucial for the performance. In this paper, we present two contributions to improve the interpretability of a geolocalization model: (1) We propose a novel semantic partitioning method which intuitively leads to an improved understanding of the predictions, while achieving state-of-the-art results for geolocational accuracy on benchmark test sets; (2) We introduce a metric to assess the importance of semantic visual concepts for a certain prediction to provide additional interpretable information, which allows for a large-scale analysis of already trained models. Source code and dataset are publicly available.
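
To make the "geolocalization as classification" framing concrete, here is a minimal sketch of how a partitioning of the world map turns coordinates into class labels. It assumes a plain fixed-size latitude/longitude grid, which is not the semantic partitioning proposed in the paper; the function names `grid_cell_id` and `cell_center` and the `cell_deg` parameter are hypothetical and chosen only for illustration.

```python
import math


def grid_cell_id(lat, lon, cell_deg=10.0):
    """Map a coordinate to the integer id of a fixed-size lat/lon grid cell.

    Assumption: a regular grid stands in for the partitioning here; the
    paper's semantic partitioning is a different scheme.
    """
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("coordinate out of range")
    n_rows = math.ceil(180.0 / cell_deg)
    n_cols = math.ceil(360.0 / cell_deg)
    # Clamp so that lat == 90 and lon == 180 fall into the last row/column.
    row = min(int((lat + 90.0) // cell_deg), n_rows - 1)
    col = min(int((lon + 180.0) // cell_deg), n_cols - 1)
    return row * n_cols + col


def cell_center(cell_id, cell_deg=10.0):
    """Return the (lat, lon) center of a cell, used as the predicted location."""
    n_cols = math.ceil(360.0 / cell_deg)
    row, col = divmod(cell_id, n_cols)
    lat = -90.0 + (row + 0.5) * cell_deg
    lon = -180.0 + (col + 0.5) * cell_deg
    return lat, lon


# A classifier would be trained to predict the cell id of each photo;
# at inference time the predicted cell is mapped back to a coordinate.
label = grid_cell_id(52.52, 13.40)  # a photo taken in Berlin
predicted_location = cell_center(label)
```

In this sketch the class count and the best achievable accuracy both depend on `cell_deg`: coarser cells are easier to classify but map back to less precise coordinates, which illustrates why the choice of partitioning matters for performance.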

Code Implementations: 1 repo