CV | Nov 25, 2024

GeoFormer: A Multi-Polygon Segmentation Transformer

arXiv:2411.16616v1 | 2 citations | h-index: 23 | BMVC
Originality: Incremental advance
AI Analysis

GeoFormer offers a more efficient alternative to existing multi-loss approaches for building vectorization in remote sensing, though the advance appears incremental because it adapts transformers to a specific domain.

The paper tackles the problem of generating scale-invariant building shapes from satellite imagery by introducing GeoFormer, an auto-regressive transformer that learns to produce multi-polygons end-to-end, outperforming prior methods.

In remote sensing there is a common need to learn scale-invariant shapes of objects such as buildings. Prior work relies on tweaking multiple loss functions to convert segmentation maps into the final scale-invariant representation, which requires arduous design and optimization. To address this, we introduce the GeoFormer, a novel architecture that remedies these challenges by learning to generate multi-polygons end-to-end. By modeling keypoints as spatially dependent tokens in an auto-regressive manner, the GeoFormer outperforms existing works in delineating building objects from satellite imagery. We evaluate the robustness of the GeoFormer against former methods through a variety of parameter ablations and highlight the advantages of optimizing a single likelihood function. Our study presents the first successful application of auto-regressive transformer models for multi-polygon predictions in remote sensing, suggesting a promising methodological alternative for building vectorization.
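To make "keypoints as tokens" concrete, here is a minimal sketch of one way a multi-polygon could be serialized into a single token sequence for an auto-regressive model. It is not the paper's implementation: the bin count, the `SEP` and `EOS` token ids, and the function names `quantize`, `encode`, and `decode` are all assumptions chosen for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]

# Hypothetical vocabulary layout (an assumption, not taken from the paper).
NUM_BINS = 224        # coordinate bins per axis
SEP = NUM_BINS        # token that separates two polygons
EOS = NUM_BINS + 1    # token that ends the whole sequence


def quantize(value: float, size: float) -> int:
    """Map a coordinate in [0, size] to an integer bin in [0, NUM_BINS - 1]."""
    bin_index = int(value / size * NUM_BINS)
    return min(max(bin_index, 0), NUM_BINS - 1)


def encode(polygons: List[Polygon], size: float) -> List[int]:
    """Flatten polygons into x, y tokens, with SEP between polygons and EOS last."""
    tokens: List[int] = []
    for i, polygon in enumerate(polygons):
        if i > 0:
            tokens.append(SEP)
        for x, y in polygon:
            tokens.append(quantize(x, size))
            tokens.append(quantize(y, size))
    tokens.append(EOS)
    return tokens


def decode(tokens: List[int], size: float) -> List[Polygon]:
    """Invert encode(), returning bin-center coordinates for each vertex."""
    polygons: List[Polygon] = []
    current: Polygon = []
    pending: List[int] = []  # holds an x token until its y token arrives
    for token in tokens:
        if token == EOS:
            break
        if token == SEP:
            if current:
                polygons.append(current)
            current, pending = [], []
            continue
        pending.append(token)
        if len(pending) == 2:
            x_bin, y_bin = pending
            current.append(((x_bin + 0.5) * size / NUM_BINS,
                            (y_bin + 0.5) * size / NUM_BINS))
            pending = []
    if current:
        polygons.append(current)
    return polygons


buildings = [
    [(0.0, 0.0), (112.0, 0.0), (112.0, 112.0)],
    [(200.0, 200.0), (223.0, 200.0), (223.0, 223.0)],
]
tokens = encode(buildings, size=224.0)
```

Under this encoding, a model that predicts each token from the preceding ones can be trained with a single next-token likelihood, which is the general idea the abstract contrasts with multi-loss pipelines. How GeoFormer actually orders, quantizes, and delimits its tokens is not stated in this summary.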

Code Implementations: 1 repo
