CV | LG | Nov 20, 2020

Assessing out-of-domain generalization for robust building damage detection

arXiv:2011.10328v1 | 25 citations | Has Code
AI Analysis

This work highlights a critical limitation for disaster response organizations relying on automated building damage detection, as current evaluation methods do not reflect real-world performance.

This paper addresses the problem of out-of-domain generalization for building damage detection models using satellite imagery. It finds that state-of-the-art models exhibit a substantial generalization gap, with performance dropping when evaluated on new disasters not seen during training, and that in-distribution performance is not predictive of out-of-domain performance.

An important step for limiting the negative impact of natural disasters is rapid damage assessment after a disaster has occurred. For instance, building damage detection can be automated by applying computer vision techniques to satellite imagery. Such models operate in a multi-domain setting: every disaster is inherently different (new geolocation, unique circumstances), and models must be robust to a shift in distribution between disaster imagery available for training and the images of the new event. Accordingly, estimating real-world performance requires an out-of-domain (OOD) test set. However, building damage detection models have so far been evaluated mostly in the simpler yet unrealistic in-distribution (IID) test setting. Here we argue that future work should focus on the OOD regime instead. We assess OOD performance of two competitive damage detection models and find that existing state-of-the-art models show a substantial generalization gap: their performance drops when evaluated OOD on new disasters not used during training. Moreover, IID performance is not predictive of OOD performance, rendering current benchmarks uninformative about real-world performance. Code and model weights are available at https://github.com/ecker-lab/robust-bdd.
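
The difference between the IID and OOD test settings can be illustrated with a small sketch. This is not the authors' code: it assumes each sample carries a disaster label, and it uses a leave-one-disaster-out scheme as one plausible way to hold out "new disasters not used during training". The function names, the `disaster` and `tile` fields, and the disaster names are all hypothetical.

```python
import random
from collections import defaultdict


def iid_split(samples, test_fraction=0.2, seed=0):
    """Random split: tiles from the same disaster may land in train and test."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def ood_splits(samples):
    """Leave-one-disaster-out: yield (held_out, train, test) for each disaster.

    The test set contains only the held-out disaster, and the training set
    contains none of its tiles.
    """
    by_disaster = defaultdict(list)
    for sample in samples:
        by_disaster[sample["disaster"]].append(sample)
    for held_out in sorted(by_disaster):
        train = [
            sample
            for disaster, group in by_disaster.items()
            if disaster != held_out
            for sample in group
        ]
        yield held_out, train, by_disaster[held_out]


# Hypothetical toy data: three disasters with four image tiles each.
samples = [
    {"disaster": name, "tile": index}
    for name in ("flood-a", "quake-b", "fire-c")
    for index in range(4)
]

iid_train, iid_test = iid_split(samples)
for held_out, ood_train, ood_test in ood_splits(samples):
    print(held_out, len(ood_train), len(ood_test))
```

Under a split like this, the generalization gap for a given model would be the difference between its score on the IID test set and its score on the held-out disaster.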
