BuildingWorld: A Structured 3D Building Dataset for Urban Foundation Models
This dataset addresses the need for globally representative 3D building data for urban foundation models, supporting applications such as energy modeling and autonomous navigation. The contribution is incremental in that it primarily provides new data rather than a novel method.
The authors address the limited architectural diversity of existing 3D building datasets, which undermines the generalizability of learning-based urban models. They present BuildingWorld, a comprehensive dataset of about five million LOD2 building models from diverse global regions, accompanied by real and simulated LiDAR point clouds.
As digital twins become central to the transformation of modern cities, accurate and structured 3D building models emerge as a key enabler of high-fidelity, updatable urban representations. These models underpin diverse applications including energy modeling, urban planning, autonomous navigation, and real-time reasoning. Despite recent advances in 3D urban modeling, most learning-based models are trained on building datasets with limited architectural diversity, which significantly undermines their generalizability across heterogeneous urban environments. To address this limitation, we present BuildingWorld, a comprehensive and structured 3D building dataset designed to bridge the gap in stylistic diversity. It encompasses buildings from geographically and architecturally diverse regions -- including North America, Europe, Asia, Africa, and Oceania -- offering a globally representative dataset for urban-scale foundation modeling and analysis. Specifically, BuildingWorld provides about five million LOD2 building models collected from diverse sources, accompanied by real and simulated airborne LiDAR point clouds. Together, the models and point clouds support research on 3D building reconstruction, detection, and segmentation. We also introduce Cyber City, a virtual city model that enables the generation of unlimited training data with customized and structurally diverse point cloud distributions. Furthermore, we provide standardized evaluation metrics tailored for building reconstruction, aiming to facilitate the training, evaluation, and comparison of large-scale vision models and foundation models in structured 3D urban environments.
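To make the idea of simulated point clouds with customized distributions concrete, the following is a toy sketch, not the Cyber City pipeline, whose internals the abstract does not describe. It samples a synthetic airborne-style point cloud from a single gable roof, with point density and vertical noise as tunable parameters. The function name `sample_gable_roof`, its parameters, the uniform nadir-view sampling, and the Gaussian noise model are all assumptions made for illustration.

```python
import random


def sample_gable_roof(width, length, eave_height, ridge_height,
                      density, noise_std=0.0, seed=0):
    """Sample a simulated point cloud from a gable roof (hypothetical helper).

    The footprint spans [0, width] x [0, length]; the ridge runs along the
    y axis at x = width / 2. `density` is points per square metre of
    footprint (a nadir-view simplification), so the point count is fixed.
    """
    rng = random.Random(seed)
    n_points = round(width * length * density)
    half = width / 2.0
    points = []
    for _ in range(n_points):
        x = rng.uniform(0.0, width)
        y = rng.uniform(0.0, length)
        # Height rises linearly from the eaves to the ridge on both slopes.
        z = eave_height + (ridge_height - eave_height) * (1.0 - abs(x - half) / half)
        # Assumed sensor model: zero-mean Gaussian noise on the height only.
        z += rng.gauss(0.0, noise_std)
        points.append((x, y, z))
    return points


if __name__ == "__main__":
    sparse = sample_gable_roof(10.0, 20.0, 6.0, 9.0, density=0.5, noise_std=0.05)
    dense = sample_gable_roof(10.0, 20.0, 6.0, 9.0, density=8.0, noise_std=0.02)
    print(len(sparse), len(dense))
```

Varying `density` and `noise_std` per building is one simple way to produce training samples with different point distributions from the same geometry. A realistic simulator would also model scan patterns, occlusion, and facade returns, none of which this sketch attempts.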