CV | Feb 29, 2024

DiffAssemble: A Unified Graph-Diffusion Model for 2D and 3D Reassembly

Gianluca Scarpellini, Stefano Fiorini, Francesco Giuliari, Pietro Morerio, Alessio Del Bue

arXiv:2402.19302v121.534 citations | h-index: 35 | Has Code | CVPR

Originality: Highly original

AI Analysis

This work addresses reassembly problems in fields like computer vision and robotics, offering a general solution with significant speed improvements, though it builds on existing diffusion and GNN techniques.

The paper tackles reassembly across 2D and 3D data by proposing DiffAssemble, a unified graph-diffusion model that achieves state-of-the-art results in most 2D and 3D reassembly tasks and runs 11 times faster than the quickest optimization-based method for puzzle solving.

Reassembly tasks play a fundamental role in many fields, and multiple approaches exist to solve specific reassembly problems. In this context, we posit that a general unified model can effectively address them all, irrespective of the input data type (images, 3D, etc.). We introduce DiffAssemble, a Graph Neural Network (GNN)-based architecture that learns to solve reassembly tasks using a diffusion model formulation. Our method treats the elements of a set, whether 2D patches or 3D object fragments, as nodes of a spatial graph. Training is performed by introducing noise into the position and rotation of the elements and iteratively denoising them to reconstruct the coherent initial pose. DiffAssemble achieves state-of-the-art (SOTA) results in most 2D and 3D reassembly tasks and is the first learning-based approach that solves 2D puzzles for both rotation and translation. Furthermore, we highlight its remarkable reduction in run-time, performing 11 times faster than the quickest optimization-based method for puzzle solving. Code is available at https://github.com/IIT-PAVIS/DiffAssemble
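The noising step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a standard DDPM-style forward process with a linear beta schedule, and it assumes each 2D piece pose is stored as (tx, ty, cos θ, sin θ). The helper names `linear_alpha_bar`, `make_pose`, and `noise_poses` are hypothetical. The denoising GNN that would be trained to invert this process is omitted.

```python
import math
import random


def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bar, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def make_pose(tx, ty, theta):
    """Assumed pose encoding for one piece: translation plus (cos, sin) of rotation."""
    return (tx, ty, math.cos(theta), math.sin(theta))


def noise_poses(poses, t, alpha_bar, rng):
    """Sample x_t ~ q(x_t | x_0) independently for every piece (graph node)."""
    signal = math.sqrt(alpha_bar[t])       # how much of the clean pose survives
    sigma = math.sqrt(1.0 - alpha_bar[t])  # standard deviation of the added noise
    return [
        tuple(signal * v + sigma * rng.gauss(0.0, 1.0) for v in pose)
        for pose in poses
    ]


# A 2x2 puzzle whose pieces sit on a grid with no rotation.
rng = random.Random(0)
alpha_bar = linear_alpha_bar(1000)
clean = [make_pose(x, y, 0.0) for y in (-0.5, 0.5) for x in (-0.5, 0.5)]

slightly_noisy = noise_poses(clean, 0, alpha_bar, rng)       # early step: near the clean poses
almost_pure_noise = noise_poses(clean, 999, alpha_bar, rng)  # final step: close to Gaussian noise
```

At inference time a model of this kind would start from poses like `almost_pure_noise` and denoise step by step; in this sketch only the forward (noising) direction is shown.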

