CV · AI · LG · RO · Apr 29

Reconstruction by Generation: 3D Multi-Object Scene Reconstruction from Sparse Observations

arXiv:2604.27106 · 97.41 citations
Predicted impact: top 5% in CV · last 90 days · Originality: Incremental advance
AI Analysis

This work addresses the challenge of accurate 3D scene reconstruction from sparse views for robotics and computer vision, offering a robust solution for heavily occluded and complex scenes.

RecGen is a generative framework for reconstructing 3D multi-object scenes from sparse RGB-D observations. It achieves state-of-the-art performance on heavily occluded datasets, outperforming SAM3D by 30.1% in geometric shape quality, 9.1% in texture reconstruction, and 33.9% in pose estimation while using nearly 80% fewer training meshes.

Accurately reconstructing complex full multi-object scenes from sparse observations remains a core challenge in computer vision and a key step toward scalable and reliable simulation for robotics. In this work, we introduce RecGen, a generative framework for probabilistic joint estimation of object and part shapes, as well as their poses, under occlusion and partial visibility from one or multiple RGB-D images. By leveraging compositional synthetic scene generation and strong 3D shape priors, RecGen generalizes across diverse object types and real-world environments. RecGen achieves state-of-the-art performance on complex, heavily occluded datasets, robustly handling severe occlusions, symmetric objects, object parts, and intricate geometry and texture. Despite using nearly 80% fewer training meshes than the previous state of the art, SAM3D, RecGen outperforms it by 30.1% in geometric shape quality, 9.1% in texture reconstruction, and 33.9% in pose estimation.

