CV · LG · Aug 23, 2020

Holistic Multi-View Building Analysis in the Wild with Projection Pooling

arXiv:2008.10041v3 · 7 citations
AI Analysis

This work addresses remote building analysis for urban planning and monitoring, but it is incremental: it builds on existing datasets and methods, adding one new layer.

The paper tackles the problem of fine-grained building attribute classification from multi-view images by introducing a new projection pooling layer that creates a unified top-view representation, improving classification accuracy compared to baseline models.

We address six different classification tasks related to fine-grained building attributes: construction type, number of floors, pitch and geometry of the roof, facade material, and occupancy class. Tackling such a remote building analysis problem became possible only recently due to growing large-scale datasets of urban scenes. To this end, we introduce a new benchmarking dataset, consisting of 49426 images (top-view and street-view) of 9674 buildings. These photos are further assembled, together with the geometric metadata. The dataset showcases various real-world challenges, such as occlusions, blur, partially visible objects, and a broad spectrum of buildings. We propose a new projection pooling layer, creating a unified, top-view representation of the top-view and the side views in a high-dimensional space. It allows us to utilize the building and imagery metadata seamlessly. Introducing this layer improves classification accuracy -- compared to highly tuned baseline models -- indicating its suitability for building analysis.
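The abstract does not spell out how the projection pooling layer works internally, so the following is only a rough sketch of the general idea it describes: street-view features are placed onto the top-view grid so that all views share one representation. Everything here is an assumption made for illustration and is not the authors' implementation: the function names `project_side_view` and `fuse_views`, the camera parameters (`cam_xy`, `heading`, `fov`), the ray-marching scheme, and the use of max pooling.

```python
import numpy as np

def project_side_view(side_feat, cam_xy, heading, fov, grid_hw,
                      n_steps=32, max_range=None):
    """Scatter one street-view feature map onto a top-view grid (hypothetical).

    side_feat: (C, Hs, Ws) features from one street-view image.
    cam_xy:    camera position (x, y) in top-view grid coordinates.
    heading:   viewing direction in radians.
    fov:       horizontal field of view in radians.
    grid_hw:   (H, W) size of the top-view grid.
    """
    C, _, Ws = side_feat.shape
    H, W = grid_hw
    if max_range is None:
        max_range = float(max(H, W))

    # Collapse the vertical image axis: one descriptor per image column.
    columns = side_feat.max(axis=1)  # shape (C, Ws)

    # Assumption: each image column maps to one ray in the ground plane.
    angles = heading + np.linspace(-fov / 2, fov / 2, Ws)
    depths = np.linspace(0.0, max_range, n_steps)

    grid = np.zeros((C, H, W), dtype=side_feat.dtype)
    hits = np.zeros((H, W), dtype=bool)
    for j, a in enumerate(angles):
        xs = np.rint(cam_xy[0] + depths * np.cos(a)).astype(int)
        ys = np.rint(cam_xy[1] + depths * np.sin(a)).astype(int)
        ok = (xs >= 0) & (xs < W) & (ys >= 0) & (ys < H)
        for x, y in zip(xs[ok], ys[ok]):
            if hits[y, x]:
                # Several rays reach this cell: keep the element-wise maximum.
                grid[:, y, x] = np.maximum(grid[:, y, x], columns[:, j])
            else:
                grid[:, y, x] = columns[:, j]
                hits[y, x] = True
    return grid  # cells that no ray reaches stay zero

def fuse_views(top_feat, side_grids):
    """Concatenate top-view features with the pooled side-view grids."""
    side = np.max(np.stack(side_grids), axis=0)
    return np.concatenate([top_feat, side], axis=0)
```

In this sketch, a building with a `(3, 16, 16)` top-view feature map and two street views of 4 channels each would yield a fused `(7, 16, 16)` tensor, which a shared classifier could then consume for each of the six attribute tasks. A learned or differentiable projection would likely replace the nearest-cell scatter used here.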

