CV · RO · Jun 2

SimuScene: Simulation-Ready Compositional 3D Scene Reconstruction from a Single Image

arXiv:2606.03994 · 94.0
Predicted impact: top 9% in CV · last 90 days · Originality: Highly original
AI Analysis

SimuScene targets robotic manipulation: it addresses physically unstable 3D scene reconstructions from single images by feeding physics-simulation feedback into the reconstruction process itself.

SimuScene reconstructs simulation-ready 3D scenes from a single image by using a physics engine as a diagnostic tool during shape and layout estimation, correcting interpenetrating and floating objects. The authors report state-of-the-art physical stability and geometric alignment, and deploy the reconstructed scenes in humanoid control and robot-arm manipulation tasks.

Reconstructing interactive, simulation-ready 3D scenes from a single image is a critical bottleneck for robotic manipulation. While recent single-image lifters recover plausible per-object shapes, composing them yields scenes that collapse under physical simulation due to interpenetrating, hovering, or sinking objects. Existing physics-aware methods address this strictly as a post-hoc layout correction, leaving the underlying geometric errors unresolved. To address this, we introduce SimuScene, a compositional 3D reconstruction pipeline that puts physics in the loop of shape and layout estimation. Rather than using physics merely for layout cleanup, we utilize the physics engine as a diagnostic measurement tool during the generative process itself. By diagnostically simulating reconstructed objects under gravity, we convert penetration and support failures into quantitative correction signals that drive gravity-axis stretching and amodal shape resampling. This physics-informed feedback loop mitigates accumulated reconstruction errors and produces a stable, simulation-ready compositional 3D scene. Extensive experiments demonstrate state-of-the-art performance on physical stability and geometric alignment benchmarks. We further highlight SimuScene's utility by deploying reconstructed environments in humanoid control and robot-arm manipulation tasks.
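
The idea of turning a support failure into a quantitative correction signal can be illustrated with a toy sketch. This is not the authors' implementation: it uses no physics engine, it models each object as an axis-aligned box, and it assumes z is the gravity axis and that the top of the object is the part trusted from the image. The names `Box`, `support_error`, `gravity_axis_scale`, and `stretch_to_support` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Toy axis-aligned object extent along the gravity (z) axis. Hypothetical type."""
    name: str
    z_min: float  # bottom of the object
    z_max: float  # top of the object


def support_error(obj: Box, support_top: float) -> float:
    """Signed gap between the object's bottom and the supporting surface.

    Positive: the object hovers above the support.
    Negative: the object penetrates (sinks into) the support.
    """
    return obj.z_min - support_top


def gravity_axis_scale(obj: Box, support_top: float) -> float:
    """Scale factor along z that makes the bottom rest on the support.

    Assumption: the top face stays fixed and only the bottom moves.
    """
    old_height = obj.z_max - obj.z_min
    new_height = obj.z_max - support_top
    if old_height <= 0 or new_height <= 0:
        raise ValueError("object must have positive height above the support")
    return new_height / old_height


def stretch_to_support(obj: Box, support_top: float, tol: float = 1e-3) -> Box:
    """Return a corrected box whose bottom sits on the support surface."""
    if abs(support_error(obj, support_top)) <= tol:
        return obj  # already resting on the support within tolerance
    # Validate that the correction is geometrically possible before applying it.
    gravity_axis_scale(obj, support_top)
    return Box(obj.name, support_top, obj.z_max)


if __name__ == "__main__":
    table_top = 0.75
    mug = Box("mug", z_min=0.80, z_max=0.90)    # hovers above the table
    book = Box("book", z_min=0.70, z_max=0.78)  # sinks into the table
    for obj in (mug, book):
        fixed = stretch_to_support(obj, table_top)
        print(obj.name, support_error(obj, table_top),
              gravity_axis_scale(obj, table_top), fixed)
```

In this sketch the signed gap plays the role of the diagnostic measurement, and the scale factor is the correction derived from it. A hovering object gets a factor above 1 (stretched down to the surface) and a penetrating object gets a factor below 1. The pipeline described in the abstract additionally resamples amodal shapes, which this toy example does not attempt.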

