LG · Jun 3, 2025

BuildingBRep-11K: Precise Multi-Storey B-Rep Building Solids with Rich Layout Metadata

arXiv:2506.15718v1 · h-index: 1
Originality: Synthesis-oriented
AI Analysis

The dataset gives researchers in AI and architecture training data for building generation and analysis. The contribution is incremental: it focuses on data creation rather than novel methods.

The authors address the lack of large, clean, and richly annotated datasets for automatic generation of building-scale 3-D objects by introducing BuildingBRep-11K, a collection of 11,978 multi-storey buildings with precise B-rep solids and metadata. They demonstrate its learnability with PointNet baselines that reach 0.37-storey MAE on storey-count regression and 54% accuracy on defect detection.

With the rise of artificial intelligence, the automatic generation of building-scale 3-D objects has become an active research topic, yet training such models still demands large, clean and richly annotated datasets. We introduce BuildingBRep-11K, a collection of 11,978 multi-storey (2-10 floors) buildings (about 10 GB) produced by a shape-grammar-driven pipeline that encodes established building-design principles. Every sample consists of a geometrically exact B-rep solid (covering floors, walls, slabs and rule-based openings) together with a fast-loading .npy metadata file that records detailed per-floor parameters. The generator incorporates constraints on spatial scale, daylight optimisation and interior layout, and the resulting objects pass multi-stage filters that remove Boolean failures, undersized rooms and extreme aspect ratios, ensuring compliance with architectural standards. To verify the dataset's learnability, we trained two lightweight PointNet baselines. (i) Multi-attribute regression. A single encoder predicts storey count, total rooms, per-storey vector and mean room area from a 4,000-point cloud. On 100 unseen buildings it attains 0.37-storey MAE (87% within ±1), 5.7-room MAE, and 3.2 m² MAE on mean area. (ii) Defect detection. With the same backbone we classify GOOD versus DEFECT; on a balanced 100-model set the network reaches 54% accuracy, recalling 82% of true defects at 53% precision (41 TP, 9 FN, 37 FP, 13 TN). These pilots show that BuildingBRep-11K is learnable yet non-trivial for both geometric regression and topological quality assessment.
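The defect-detection figures can be checked directly from the reported confusion counts. The following is a minimal sketch, not the authors' evaluation code. It assumes DEFECT is the positive class, and the helper name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, fn, fp, tn):
    """Return (accuracy, recall, precision) for a binary confusion matrix."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total   # share of all models classified correctly
    recall = tp / (tp + fn)        # share of true defects that were flagged
    precision = tp / (tp + fp)     # share of flagged models that are truly defective
    return accuracy, recall, precision


# Counts reported in the abstract (assumption: DEFECT is the positive class).
accuracy, recall, precision = classification_metrics(tp=41, fn=9, fp=37, tn=13)
print(f"accuracy={accuracy:.2f} recall={recall:.2f} precision={precision:.2f}")
```

The same counts imply that only 13 of the 50 GOOD models were classified correctly, so the high recall comes with many false alarms. This is consistent with the abstract's description of the task as non-trivial.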

