CV · Nov 7, 2023

Supervised domain adaptation for building extraction from off-nadir aerial images

arXiv:2311.03867v2 · 1 citation · h-index: 7
Originality: Incremental advance
AI Analysis

The paper addresses building extraction for urban planning by improving accuracy on noisy datasets, though the contribution appears incremental because it builds on existing domain adaptation and encoder-decoder approaches.

This paper tackles building extraction from off-nadir aerial images, where misalignment between labels and images reduces accuracy. It proposes a supervised domain adaptation method using encoder-decoder networks, which achieved F1 scores of up to 0.943 for low-rise buildings and outperformed existing methods such as knowledge distillation and deep mutual learning.

Building extraction, needed for inventory management and planning of the urban environment, is affected by the misalignment between labels and off-nadir source imagery in training data. Teacher-Student learning of noise-tolerant convolutional neural networks (CNNs) is the existing solution, but the Student networks typically have lower accuracy and cannot surpass the Teacher's performance. This paper proposes a supervised domain adaptation (SDA) of encoder-decoder networks (EDNs) between noisy and clean datasets to tackle the problem. EDNs are configured with high-performing lightweight encoders such as EfficientNet, ResNeSt, and MobileViT. The proposed method is compared against existing Teacher-Student learning methods such as knowledge distillation (KD) and deep mutual learning (DML) using three newly developed datasets. The methods are evaluated for different urban building types (low-rise, mid-rise, high-rise, and skyscrapers), where misalignment increases with building height and spatial resolution. For a robust experimental design, 43 lightweight CNNs, five optimisers, nine loss functions, and seven EDNs are benchmarked to obtain the best-performing EDN for SDA. The SDA of the best-performing EDN from our study significantly outperformed KD and DML, with F1 scores of up to 0.943, 0.868, 0.912, and 0.697 for low-rise, mid-rise, high-rise, and skyscraper buildings, respectively. The proposed method and the experimental findings will be beneficial in training robust CNNs for building extraction.
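
To make the reported metric concrete, the following is a minimal sketch of a pixel-wise F1 score on binary building masks, and of how a horizontal label offset of the kind caused by off-nadir imagery lowers it. This is an illustration only, not the paper's evaluation code: the function names `f1_score` and `shift_right`, the toy 6x8 grid, and the 2-pixel offset are all assumptions made for this example.

```python
def f1_score(pred, truth):
    """Pixel-wise F1 for two equally sized binary masks (lists of 0/1 rows)."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # building predicted and labelled
            elif p and not t:
                fp += 1          # building predicted, background labelled
            elif t and not p:
                fn += 1          # building labelled, background predicted
    denom = 2 * tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if denom == 0 else 2 * tp / denom


def shift_right(mask, dx):
    """Shift a mask dx pixels to the right, padding with background (0)."""
    width = len(mask[0])
    return [[0] * dx + row[: width - dx] for row in mask]


# Hypothetical toy scene: a 6x8 grid with one 4x4 building footprint.
footprint = [
    [1 if 1 <= r <= 4 and 1 <= c <= 4 else 0 for c in range(8)]
    for r in range(6)
]

aligned = f1_score(footprint, footprint)
misaligned = f1_score(shift_right(footprint, 2), footprint)
print(aligned, misaligned)  # 1.0 0.5
```

In this toy case a 2-pixel offset on a 4-pixel-wide footprint halves the overlap, so the F1 score drops from 1.0 to 0.5 even though the shape itself is unchanged.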

