CV · Jun 2

GARDEN: Gravity-Aligned Reconstruction of Disentangled ENvironments from RGB images

arXiv:2606.03921 · 19.1 · h-index: 4
Predicted impact: top 26% in CV · last 90 days · Originality: Incremental advance
AI Analysis

For researchers in 3D scene reconstruction and simulation, GARDEN provides a method to produce physically structured environments without CAD retrieval, addressing the bottleneck of monolithic scene representations.

GARDEN converts multi-view RGB images into simulation-ready 3D environments by using gravity as a physical prior to align reconstructions, recover object-centric rigid meshes with accurate 6-DoF placement, and remove duplicate object geometry from the background. It improves object placement reliability, disentanglement quality, and rendering-simulation efficiency over retrieval-based baselines.

Converting multi-view RGB observations into simulation-ready 3D environments remains challenging because current reconstruction pipelines produce monolithic scene representations without explicit physical structure. These representations are typically defined only up to an arbitrary global rotation and entangle rigid foreground objects with background geometry, which hinders stable physical interaction. Existing solutions often recover interactivity by replacing reconstructed objects with retrieved CAD assets, but this introduces a slow retrieval-and-replacement stage and weakens scene-specific geometric fidelity. We propose GARDEN, an RGB-only framework that reformulates reconstruction as physically grounded scene factorization and outputs a structured hybrid scene representation. The key idea is to use gravity as a universal physical prior: we first align the reconstruction to a unified Gravity-View frame to resolve gauge ambiguity, then recover object-centric rigid meshes with accurate 6-DoF placement, and finally remove duplicate object geometry from the background through conditional 3D point classification. The resulting representation combines explicit rigid bodies with a decoupled background, enabling direct physics simulation while preserving visual realism. Experiments on both simulated and real multi-view scenes show that GARDEN improves object placement reliability, disentanglement quality, and rendering-simulation efficiency compared with retrieval-based baselines.
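To make the gravity-alignment step concrete, here is a minimal sketch of rotating a reconstruction so that an estimated gravity direction coincides with a fixed world axis. It is not GARDEN's implementation: the abstract does not say how gravity is estimated or how the remaining yaw is fixed in the Gravity-View frame. The names `rotation_aligning`, `estimated_gravity`, and `GRAVITY_TARGET`, and the choice of -Z as "down", are assumptions made for illustration. The sketch uses the standard Rodrigues construction for the rotation taking one unit vector onto another, with plain Python lists to avoid dependencies.

```python
import math

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    if n == 0.0:
        raise ValueError("zero-length vector")
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def skew(v):
    # Cross-product matrix: skew(v) applied to p equals cross(v, p).
    return [[0.0, -v[2], v[1]],
            [v[2], 0.0, -v[0]],
            [-v[1], v[0], 0.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation_aligning(a, b):
    """Return a 3x3 rotation R such that R @ a is parallel to b."""
    a, b = normalize(a), normalize(b)
    c = dot(a, b)
    if c < -1.0 + 1e-9:
        # Antiparallel case: rotate 180 degrees about any axis
        # perpendicular to a, i.e. R = 2 u u^T - I.
        helper = [1.0, 0.0, 0.0] if abs(a[0]) < 0.9 else [0.0, 1.0, 0.0]
        u = normalize(cross(a, helper))
        return [[2.0 * u[i] * u[j] - (1.0 if i == j else 0.0)
                 for j in range(3)] for i in range(3)]
    # General case: R = I + K + K^2 / (1 + c), with K = skew(a x b).
    K = skew(cross(a, b))
    K2 = matmul(K, K)
    s = 1.0 / (1.0 + c)
    return [[(1.0 if i == j else 0.0) + K[i][j] + s * K2[i][j]
             for j in range(3)] for i in range(3)]

def apply(R, p):
    return [dot(row, p) for row in R]

# Assumption: "down" in the aligned frame is -Z.
GRAVITY_TARGET = [0.0, 0.0, -1.0]

# Hypothetical gravity estimate expressed in the arbitrary reconstruction frame.
estimated_gravity = [0.0, -1.0, 0.0]

R = rotation_aligning(estimated_gravity, GRAVITY_TARGET)
aligned_gravity = apply(R, estimated_gravity)

# The same rotation would be applied to every reconstructed point.
points = [[1.0, 2.0, 3.0], [0.5, -0.5, 0.0]]
aligned_points = [apply(R, p) for p in points]
```

Aligning gravity alone removes only two of the three rotational degrees of freedom; rotation about the vertical axis stays free and would need an additional convention, such as a reference view direction, to be pinned down.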

