CV · Mar 28, 2025

ArchCAD-400K: A Large-Scale CAD drawings Dataset and New Baseline for Panoptic Symbol Spotting

arXiv:2503.22346v3 · 3 citations · h-index: 12
Originality: Incremental advance
AI Analysis

The paper addresses the need for efficient symbol recognition in architectural CAD drawings, which is critical for engineering applications. The advance is incremental, since it builds on existing datasets and methods.

The authors tackle symbol recognition in architectural CAD drawings by introducing ArchCAD-400K, a large-scale dataset over 26 times larger than the largest existing CAD dataset, and a new baseline model, DPSS, which achieves state-of-the-art performance on panoptic symbol spotting.

Recognizing symbols in architectural CAD drawings is critical for various advanced engineering applications. In this paper, we propose a novel CAD data annotation engine that leverages intrinsic attributes from systematically archived CAD drawings to automatically generate high-quality annotations, thus significantly reducing manual labeling efforts. Utilizing this engine, we construct ArchCAD-400K, a large-scale CAD dataset consisting of 413,062 chunks from 5538 highly standardized drawings, making it over 26 times larger than the largest existing CAD dataset. ArchCAD-400K boasts an extended drawing diversity and broader categories, offering line-grained annotations. Furthermore, we present a new baseline model for panoptic symbol spotting, termed Dual-Pathway Symbol Spotter (DPSS). It incorporates an adaptive fusion module to enhance primitive features with complementary image features, achieving state-of-the-art performance and enhanced robustness. Extensive experiments validate the effectiveness of DPSS, demonstrating the value of ArchCAD-400K and its potential to drive innovation in architectural design and construction.

