CRAG: Can 3D Generative Models Help 3D Assembly?
Existing 3D assembly methods cannot synthesize missing geometry, a significant limitation for robotics and computer graphics applications that deal with incomplete 3D data. This work addresses that limitation.
This paper reformulates 3D assembly as a joint problem of assembly and generation, so that the model can synthesize missing geometry while predicting poses for the input parts. The proposed method, CRAG, achieves state-of-the-art performance on in-the-wild objects with diverse geometries, varying part counts, and missing pieces.
Most existing 3D assembly methods treat the problem as pure pose estimation, rearranging observed parts via rigid transformations. In contrast, human assembly naturally couples structural reasoning with holistic shape inference. Inspired by this intuition, we reformulate 3D assembly as a joint problem of assembly and generation. We show that these two processes are mutually reinforcing: assembly provides part-level structural priors for generation, while generation injects holistic shape context that resolves ambiguities in assembly. We propose CRAG, which, unlike prior methods that cannot synthesize missing geometry, simultaneously generates plausible complete shapes and predicts poses for input parts. Extensive experiments demonstrate state-of-the-art performance across in-the-wild objects with diverse geometries, varying part counts, and missing pieces. Our code and models will be released.
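To make the pose-estimation view of assembly concrete, the following is a minimal sketch of what "rearranging observed parts via rigid transformations" means: each part is a point cloud in its own local frame, and a pose is a rotation R and translation t applied as x -> R x + t. This is not CRAG's implementation; the helper names (rotation_z, apply_pose, assemble) and the toy parts are assumptions made purely for illustration, and the poses are hard-coded here rather than predicted by a model.

```python
import numpy as np

def rotation_z(theta):
    """Rotation matrix for an angle theta (radians) about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def apply_pose(points, R, t):
    """Apply the rigid transform x -> R @ x + t to an (N, 3) point array."""
    return points @ R.T + t

def assemble(parts, poses):
    """Place each part with its own (R, t) pose and stack the results."""
    return np.vstack([apply_pose(p, R, t) for p, (R, t) in zip(parts, poses)])

# Two toy parts (hypothetical data), each given in its own local frame.
part_a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
part_b = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])

# Hand-picked poses standing in for a pose estimator's output.
poses = [
    (np.eye(3), np.zeros(3)),                             # part A stays in place
    (rotation_z(np.pi / 2), np.array([1.0, 0.0, 0.0])),   # part B: rotate, then shift
]

assembled = assemble([part_a, part_b], poses)
```

Because a rigid transform can only move points that already exist, a pipeline of this form has no way to fill in a missing piece, which is the gap the joint assembly-and-generation formulation is meant to close.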