CV · Apr 15, 2020

Contextual Pyramid Attention Network for Building Segmentation in Aerial Imagery

arXiv:2004.07018v1 · 1 citation
AI Analysis

This work addresses building extraction for applications like urban planning and disaster management, representing an incremental improvement in semantic segmentation for remote sensing.

The paper tackles building segmentation in aerial imagery by proposing a contextual pyramid attention network that captures long-range dependencies. It reports state-of-the-art performance, with IoU 1.8 points above current state-of-the-art methods and 12.6 points above existing baselines.

Building extraction from aerial images has several applications in problems such as urban planning, change detection, and disaster management. With the increasing availability of data, Convolutional Neural Networks (CNNs) for semantic segmentation of remote sensing imagery have improved significantly in recent years. However, convolutions operate in local neighborhoods and fail to capture non-local features that are essential to the semantic understanding of aerial images. In this work, we propose to improve the segmentation of buildings of different sizes by capturing long-range dependencies using contextual pyramid attention (CPA). The pathways process the input at multiple scales efficiently and combine them in a weighted manner, similar to an ensemble model. The proposed method obtains state-of-the-art performance on the Inria Aerial Image Labelling Dataset with minimal computational cost. Without any post-processing, our method scores 1.8 points higher than current state-of-the-art methods and 12.6 points higher than existing baselines on the Intersection over Union (IoU) metric. Code and models will be made publicly available.
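For readers unfamiliar with the metric quoted above, the following is a minimal sketch of Intersection over Union for a single pair of binary building masks. It is not taken from the paper's code: the function name `binary_iou`, the nested-list mask representation, and the convention that two empty masks score 1.0 are all assumptions made for illustration. The "points" in the abstract are presumably differences in IoU expressed as a percentage.

```python
def binary_iou(pred, target):
    """IoU between two binary masks given as equally sized nested lists of 0/1.

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            # A pixel is in the intersection if both masks mark it as building.
            intersection += 1 if (p and t) else 0
            # A pixel is in the union if either mask marks it as building.
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if union == 0 else intersection / union


pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
print(binary_iou(pred, target))  # 2 shared pixels out of 4 marked pixels
```

Benchmark scores are often computed by accumulating intersection and union counts over all test images before dividing, rather than averaging per-image values; which variant a given leaderboard uses should be checked against its own documentation.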

