CV · May 22

GenRecon: Bridging Generative Priors for Multi-View 3D Scene Reconstruction

arXiv:2605.23888 · 93.7
Predicted impact: top 15% in CV · last 90 days
Originality: Incremental advance
AI Analysis

For 3D vision researchers, this work enables high-fidelity, editable scene-scale reconstruction by generalizing object-level generative priors to multi-view indoor environments.

GenRecon achieves high-fidelity 3D scene reconstruction from multi-view RGB images by coupling reconstruction with a generative 3D prior (Trellis.2), outperforming cutting-edge methods by 16%.

We introduce a new approach to high-fidelity 3D scene reconstruction from multi-view RGB images that tightly couples reconstruction with a strong generative 3D prior. We cast scene reconstruction as conditional 3D generation over a set of spatially-localized, overlapping chunks that together tile the scene, scaling generation to large scene extents. Crucially, we inherit the fidelity and completeness of state-of-the-art generative shape models -- we use Trellis.2 as an example -- which we generalize to the scene level. To this end, we propose a projection-based conditioning mechanism that lifts posed multi-view image features into a coherent 3D representation aligned with the generative model, independent of view ordering and spatially anchored to the scene, yielding high-fidelity, multi-view consistent generated geometry. This enables lifting the strong object-level prior of Trellis.2 to multi-view, scene-scale generation, producing faithful, editable PBR mesh reconstructions of indoor environments. As a result, we obtain high-fidelity results that outperform cutting-edge reconstruction methods by 16%.
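The two mechanisms named in the abstract, tiling the scene with overlapping chunks and lifting posed image features into 3D by projection, can be illustrated with a small sketch. This is not the authors' implementation. The function names, the camera dictionary layout, the nearest-pixel lookup, and the choice of mean pooling across views are all assumptions made for illustration, and occlusion is ignored. Mean pooling is used here only because it makes the result independent of view ordering, which is the property the abstract describes.

```python
from itertools import product


def chunk_origins(extent, chunk, overlap):
    """Start offsets along one axis so that chunks of size `chunk`,
    overlapping by `overlap`, cover the interval [0, extent]."""
    stride = chunk - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than the chunk size")
    origins = [0.0]
    while origins[-1] + chunk < extent:
        origins.append(origins[-1] + stride)
    return origins


def tile_scene(extent_xyz, chunk, overlap):
    """Origins of cubic, overlapping chunks tiling a box of size extent_xyz."""
    axes = [chunk_origins(e, chunk, overlap) for e in extent_xyz]
    return list(product(*axes))


def project(point, cam):
    """Pinhole projection of a world-space point (hypothetical camera layout).
    cam holds a world-to-camera rotation R (3x3 nested lists), translation t,
    focal length f, principal point (cx, cy) and image size (w, h).
    Returns an integer pixel (u, v), or None if the point is not visible."""
    R, t = cam["R"], cam["t"]
    x, y, z = (sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3))
    if z <= 1e-6:  # behind or on the camera plane
        return None
    u = int(round(cam["f"] * x / z + cam["cx"]))
    v = int(round(cam["f"] * y / z + cam["cy"]))
    if 0 <= u < cam["w"] and 0 <= v < cam["h"]:
        return u, v
    return None


def lift_features(points, cams, feature_maps):
    """Average, per 3D point, the features of every view that sees it.
    feature_maps[k][v][u] is the feature vector (list of floats) of view k.
    Points seen by no view get None. Occlusion is not handled (assumption)."""
    lifted = []
    for p in points:
        acc, n = None, 0
        for cam, fmap in zip(cams, feature_maps):
            uv = project(p, cam)
            if uv is None:
                continue
            feat = fmap[uv[1]][uv[0]]
            acc = list(feat) if acc is None else [a + b for a, b in zip(acc, feat)]
            n += 1
        lifted.append([a / n for a in acc] if n else None)
    return lifted
```

In this sketch, each chunk origin from `tile_scene` would define a local grid of 3D points, and `lift_features` would supply the per-point conditioning for that chunk. Because the pooled features are indexed by world-space position, neighbouring chunks see identical conditioning in their overlap region.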

