CV | May 21, 2025

The P$^3$ dataset: Pixels, Points and Polygons for Multimodal Building Vectorization

arXiv:2505.15379v1 | h-index: 30 | Has Code | IEEE Trans robot
Originality: Synthesis-oriented
AI Analysis

This paper provides a multimodal benchmark for building vectorization, a domain-specific problem in geospatial analysis. The contribution is incremental: it extends existing image-focused datasets by adding LiDAR.

The authors tackled building vectorization by creating the P$^3$ dataset, which combines aerial LiDAR, imagery, and vector outlines across three continents, and showed that LiDAR improves polygon prediction accuracy and geometric quality in hybrid and end-to-end frameworks.

We present the P$^3$ dataset, a large-scale multimodal benchmark for building vectorization, constructed from aerial LiDAR point clouds, high-resolution aerial imagery, and vectorized 2D building outlines, collected across three continents. The dataset contains over 10 billion LiDAR points with decimeter-level accuracy and RGB images at a ground sampling distance of 25 centimeters. While many existing datasets primarily focus on the image modality, P$^3$ offers a complementary perspective by also incorporating dense 3D information. We demonstrate that LiDAR point clouds serve as a robust modality for predicting building polygons, both in hybrid and end-to-end learning frameworks. Moreover, fusing aerial LiDAR and imagery further improves the accuracy and geometric quality of predicted polygons. The P$^3$ dataset is publicly available, along with code and pretrained weights of three state-of-the-art models for building polygon prediction, at https://github.com/raphaelsulzer/PixelsPointsPolygons .

Code Implementations: 2 repos
