CV · IV · May 12

GATA2Floor: Graph attention for floor counting in street-view facades

arXiv:2605.11863 · 41.5
Predicted impact: top 77% in CV · last 90 days · Originality: Incremental advance
AI Analysis

For urban analytics and emergency planning, this work offers a method for automatically reasoning about building facades. It is rated an incremental advance, and the abstract gives no concrete numbers or comparisons to support a stronger assessment.

GATA2Floor uses graph attention to predict floor counts from street-view facade images. It is designed to be robust to irregular facade designs and can be applied without labeled data by drawing element proposals from self-supervised features and vision-language scoring.

Automated analysis of building facades from street-level imagery has great potential for urban analytics, energy assessment, and emergency planning. However, it requires reasoning over spatially arranged elements rather than solely isolated detections. In this work, we model each facade as a graph over window/door detections with a vertical prior on edges. Additionally, we introduce GATA2Floor, a multi-head Graph Attention v2 (GATv2) based model that predicts the global floor count of a building and, via learnable cross-attention queries, softly assigns elements to latent floor slots, yielding interpretable outputs and robustness to irregular designs. To mitigate the lack of labeled datasets, we demonstrate that the proposed graph-based reasoning can be applied without annotations by leveraging a lightweight label-free proposal mechanism based on self-supervised features and vision-language scoring. Our approach demonstrates the value of graph-attention-based relational reasoning for facade understanding.
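
The facade-graph idea in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption, since the abstract gives no implementation details: the function names `build_facade_graph` and `soft_assign`, the k-nearest-neighbour edge rule, and the Gaussian form of the vertical prior (which here favours edges between elements at similar heights) are all hypothetical. The soft assignment is a plain dot-product softmax standing in for the learnable cross-attention queries. GATv2 message passing itself is not implemented.

```python
import numpy as np

def build_facade_graph(boxes, k=4, sigma_y=0.05):
    """Build a k-NN graph over window/door detections.

    boxes: (N, 4) array of [x_min, y_min, x_max, y_max] in normalized coords.
    Returns (edges, weights). The Gaussian weight on vertical offset is an
    ASSUMED form of the "vertical prior", not the paper's definition.
    """
    boxes = np.asarray(boxes, dtype=float)
    centers = np.stack([(boxes[:, 0] + boxes[:, 2]) / 2,
                        (boxes[:, 1] + boxes[:, 3]) / 2], axis=1)
    n = len(centers)
    diff = centers[:, None, :] - centers[None, :, :]   # pairwise offsets
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)                     # no self-loops
    k = min(k, n - 1)
    edges, weights = [], []
    for i in range(n):
        for j in np.argsort(dist[i])[:k]:              # k nearest neighbours
            dy = diff[i, j, 1]                         # vertical offset
            edges.append((i, int(j)))
            weights.append(float(np.exp(-(dy / sigma_y) ** 2)))
    return np.array(edges), np.array(weights)

def soft_assign(node_feats, slot_queries):
    """Softly assign each element to floor slots (each row sums to 1).

    A simplified stand-in for cross-attention: scaled dot products between
    node features and slot queries, followed by a softmax over slots.
    """
    logits = node_feats @ slot_queries.T / np.sqrt(node_feats.shape[1])
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

# Toy facade: two floors with three windows each (hypothetical detections).
boxes = [[x, y, x + 0.2, y + 0.2] for y in (0.2, 0.6) for x in (0.1, 0.4, 0.7)]
edges, weights = build_facade_graph(boxes, k=2)

rng = np.random.default_rng(0)
feats = rng.normal(size=(len(boxes), 8))    # placeholder node features
queries = rng.normal(size=(3, 8))           # 3 latent floor slots (untrained)
assignment = soft_assign(feats, queries)
```

In this toy setup, edges between windows on the same floor receive a weight of 1, while edges crossing floors are strongly down-weighted. In a trained model the slot queries would be learned parameters, and the node features would come from the graph attention layers rather than random draws.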
