CV · May 7

Text-to-CAD Retrieval: a Strong Baseline

arXiv:2605.05572 · 15.8 · h-index: 14
Predicted impact: top 60% in CV · last 90 days · Originality: Incremental advance
AI Analysis

For engineers and designers needing to retrieve CAD models from large databases using natural language, this work addresses an underexplored task with a practical benchmark and baseline method.

The paper introduces text-to-CAD retrieval as a new cross-modal task and proposes a unified framework that learns multi-modal CAD embeddings from procedural sequences and point clouds. An auxiliary feature decoder is used only during training; removing it at inference keeps retrieval efficient. The framework serves as a strong baseline and lays the foundation for downstream CAD generation.

Text-based retrieval of Computer-Aided Design (CAD) models is a critical yet underexplored task for the reuse of legacy industrial designs. Existing CAD repositories are typically searched using filenames or directories, which limits the efficiency, scalability, and accuracy of design retrieval. In this paper, we formally introduce text-to-CAD retrieval as a new cross-modal retrieval task, aiming to retrieve semantically relevant CAD models from large-scale databases given natural language queries. Leveraging paired text-CAD annotations from the Text2CAD dataset, we establish a practical benchmark for this task. To achieve text-based retrieval, we propose a unified framework that learns multi-modal CAD embeddings from both procedural sequences and geometric point clouds. Specifically, a sequence encoder captures the construction logic of CAD models, while a point encoder extracts explicit geometric features. A text encoder is used to learn semantic representations of textual queries. During training, we introduce a novel feature decoder that reconstructs masked sequence features via cross-attention with text and point features, encouraging implicit multi-modal alignment. At inference time, we remove this auxiliary decoder to enable efficient retrieval using concatenated sequence-point features. Our framework serves as a strong baseline for text-to-CAD retrieval and lays the foundation for downstream CAD generation paradigms, such as retrieval-augmented generation. The source code will be released.
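The inference step described in the abstract (retrieval with concatenated sequence-point features) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy feature vectors, the use of cosine similarity for ranking, and the assumption that the text feature has the same dimension as the concatenated CAD feature are all hypothetical choices made for illustration. The learned encoders are replaced by hand-written vectors.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def cad_embedding(seq_feat, point_feat):
    """Concatenate sequence and point features into one CAD embedding.

    Assumption: both features come from encoders that are already trained;
    here they are plain lists of floats.
    """
    return l2_normalize(list(seq_feat) + list(point_feat))

def retrieve(text_feat, cad_index, top_k=5):
    """Rank CAD models by cosine similarity to the text feature.

    Assumption: cosine similarity is the ranking score, and text_feat has
    the same length as the concatenated CAD embeddings.
    """
    query = l2_normalize(text_feat)
    scored = []
    for cad_id, emb in cad_index.items():
        score = sum(q * e for q, e in zip(query, emb))
        scored.append((score, cad_id))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(cad_id, score) for score, cad_id in scored[:top_k]]

# Toy database: two CAD models with made-up 2-D sequence and point features.
index = {
    "bracket": cad_embedding([1.0, 0.0], [0.9, 0.1]),
    "flange": cad_embedding([0.0, 1.0], [0.1, 0.9]),
}

# Made-up text feature that points in the same direction as "bracket".
query = [1.0, 0.0, 1.0, 0.0]
results = retrieve(query, index, top_k=1)
print(results[0][0])
```

Because the CAD embeddings do not depend on the query, they can be computed once and stored, so only the text feature needs to be computed per query.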

