CV · AI · Jun 3

UniCAD: A Unified Benchmark and Universal Model for Multi-Modal Multi-Task CAD

arXiv:2606.05058 · 82.2
Predicted impact: top 25% in CV · last 90 days · Originality: Incremental advance
AI Analysis

This work provides a comprehensive benchmark and a single model for multiple CAD tasks, addressing the lack of unified evaluation in the field.

UniCAD introduces a unified benchmark for multi-modal CAD learning and a universal model (UniCAD-MLLM) that achieves state-of-the-art performance across point-to-CAD reconstruction, text/image-to-CAD generation, and CAD question answering, outperforming existing baselines.

Computer-Aided Design (CAD) underpins modern engineering and manufacturing by enabling the creation of precise, editable 3D models. However, CAD research typically studies tasks in isolation, and multi-modal, multi-task learning for CAD is hindered by the absence of a unified benchmark. To address this gap, we introduce UniCAD, a comprehensive benchmark for multi-modal CAD learning that covers point-to-CAD reconstruction, text/image-to-CAD generation, and CAD question answering across diverse input modalities. Alongside the benchmark, we present UniCAD-MLLM, a universal multi-modal large language model that ingests text, images, sketches, and point clouds and performs these heterogeneous tasks in an end-to-end fashion within a single framework. Extensive experiments on the UniCAD and Fusion360 benchmarks demonstrate that UniCAD-MLLM achieves state-of-the-art performance across all tasks, outperforming existing task-specific and multi-task baselines. We will release the dataset, code, and pretrained models to accelerate future research.

