CV · Mar 22, 2025

EMPLACE: Self-Supervised Urban Scene Change Detection

arXiv:2503.17716v1 · 8 citations · h-index: 16 · Has Code · AAAI
Originality: Incremental advance
AI Analysis

This paper addresses the labor-intensive labeling and limited datasets that constrain urban change detection, a problem that matters to city planners and researchers. The advance is incremental, since it builds on existing self-supervised and dataset expansion approaches.

The paper tackles urban scene change detection by introducing AC-1M, the largest USCD dataset with over 1.1M images, and EMPLACE, a self-supervised method that uses an adaptive triplet loss. EMPLACE outperforms state-of-the-art methods in both linear fine-tuning and zero-shot settings.

Urban change is a constant process that influences the perception of neighbourhoods and the lives of the people within them. The field of Urban Scene Change Detection (USCD) aims to capture changes in street scenes using computer vision and can help raise awareness of changes, making it possible to better understand the city and its residents. Traditionally, the field of USCD has used supervised methods with small-scale datasets. This constrains methods when applied to new cities, as it requires labour-intensive labeling processes and forces a priori definitions of relevant change. In this paper we introduce AC-1M, by far the largest USCD dataset with over 1.1M images, together with EMPLACE, a self-supervised method to train a Vision Transformer using our adaptive triplet loss. We show that EMPLACE outperforms SOTA methods both as a pre-training method for linear fine-tuning and in a zero-shot setting. Lastly, in a case study of Amsterdam, we show that we are able to detect both small and large changes throughout the city and that changes uncovered by EMPLACE, depending on size, correlate with housing prices, which in turn is indicative of inequity.
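For readers unfamiliar with triplet losses, the following is a minimal sketch of the standard (non-adaptive) triplet margin loss on plain Python lists. It is not the paper's method: how EMPLACE selects triplets and adapts the loss is not described in this summary, so the function names, the fixed `margin`, and the toy embeddings below are all illustrative assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss (hypothetical helper, not from the paper).

    Penalises the case where the anchor is not at least `margin` closer
    to the positive than to the negative. A fixed margin is assumed here;
    the paper's adaptive variant is not reproduced.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D embeddings (made up for illustration only).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]   # distance 1 from the anchor
negative = [3.0, 4.0]   # distance 5 from the anchor

# Positive is already much closer than the negative, so the loss is zero.
loss_easy = triplet_loss(anchor, positive, negative)

# Swapping the roles violates the margin, so the loss is positive.
loss_hard = triplet_loss(anchor, negative, positive)
```

In a self-supervised setting, the anchor, positive, and negative embeddings would come from the network being trained rather than from hand-written vectors, and the loss would be averaged over a batch.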

Code Implementations: 1 repo