Shaojie Zhou

h-index7
2papers

2 Papers

AIFeb 24
Qwen-BIM: developing large language model for BIM-based design with domain-specific benchmark and dataset

Jia-Rui Lin, Yun-Hong Cai, Xiang-Rui Ni et al.

As the construction industry advances toward digital transformation, BIM (Building Information Modeling)-based design has become a key driver supporting intelligent construction. Despite Large Language Models (LLMs) have shown potential in promoting BIM-based design, the lack of specific datasets and LLM evaluation benchmarks has significantly hindered the performance of LLMs. Therefore, this paper addresses this gap by proposing: 1) an evaluation benchmark for BIM-based design together with corresponding quantitative indicators to evaluate the performance of LLMs, 2) a method for generating textual data from BIM and constructing corresponding BIM-derived datasets for LLM evaluation and fine-tuning, and 3) a fine-tuning strategy to adapt LLMs for BIM-based design. Results demonstrate that the proposed domain-specific benchmark effectively and comprehensively assesses LLM capabilities, highlighting that general LLMs are still incompetent for domain-specific tasks. Meanwhile, with the proposed benchmark and datasets, Qwen-BIM is developed and achieves a 21.0% average increase in G-Eval score compared to the base LLM model. Notably, with only 14B parameters, performance of Qwen-BIM is comparable to that of general LLMs with 671B parameters for BIM-based design tasks. Overall, this study develops the first domain-specific LLM for BIM-based design by introducing a comprehensive benchmark and high-quality dataset, which provide a solid foundation for developing BIM-related LLMs in various fields.

CVDec 26, 2024
Impact of color and mixing proportion of synthetic point clouds on semantic segmentation

Shaojie Zhou, Jia-Rui Lin, Peng Pan et al.

Deep learning (DL)-based point cloud segmentation is essential for understanding built environment. Despite synthetic point clouds (SPC) having the potential to compensate for data shortage, how synthetic color and mixing proportion impact DL-based segmentation remains a long-standing question. Therefore, this paper addresses this question with extensive experiments by introducing: 1) method to generate SPC with real colors and uniform colors from BIM, and 2) enhanced benchmarks for better performance evaluation. Experiments on DL models including PointNet, PointNet++, and DGCNN show that model performance on SPC with real colors outperforms that on SPC with uniform colors by 8.2 % + on both OA and mIoU. Furthermore, a higher than 70 % mixing proportion of SPC usually leads to better performance. And SPC can replace real ones to train a DL model for detecting large and flat building elements. Overall, this paper unveils the performance-improving mechanism of SPC and brings new insights to boost SPC's value (for building large models for point clouds).