RO · May 17

Real-to-Sim for Highly Cluttered Environments via Physics-Consistent Inter-Object Reasoning

arXiv:2602.12633 · 74.21 citations · h-index: 6
AI Analysis

For roboticists needing reliable simulation for manipulation in cluttered scenes, this work addresses the critical bottleneck of physical inconsistency in standard perception pipelines.

This work tackles the problem of reconstructing physically valid 3D scenes from single-view RGB-D data for robotic manipulation in cluttered environments. The proposed physics-constrained pipeline achieves high physical fidelity, enabling stable contact-rich manipulation in both simulation and real-world settings.

Reconstructing physically valid 3D scenes from single-view observations is a prerequisite for bridging the gap between visual perception and robotic control. However, in scenarios requiring precise contact reasoning, such as robotic manipulation in highly cluttered environments, geometric fidelity alone is insufficient. Standard perception pipelines often neglect physical constraints, resulting in invalid states, e.g., floating objects or severe inter-penetration, rendering downstream simulation unreliable. To address these limitations, we propose a novel physics-constrained Real-to-Sim pipeline that reconstructs physically consistent 3D scenes from single-view RGB-D data. Central to our approach is a differentiable optimization pipeline that explicitly models spatial dependencies via a contact graph, jointly refining object poses and physical properties through differentiable rigid-body simulation. Extensive evaluations in both simulation and real-world settings demonstrate that our reconstructed scenes achieve high physical fidelity and faithfully replicate real-world contact dynamics, enabling stable and reliable contact-rich manipulation.
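To make the idea of refining poses against a contact graph concrete, here is a minimal toy sketch. It is not the paper's implementation: the paper uses differentiable rigid-body simulation and also refines physical properties, whereas this sketch only adjusts the vertical positions of stacked boxes with a hand-written soft penalty. All names (`refine_heights`, `contact_gaps`, `contact_edges`, `data_weight`) and all numbers are assumptions made for illustration.

```python
# Toy 1-D illustration (hypothetical, NOT the paper's pipeline).
# Each object is a box described by its centre height z and its full height.
# The contact graph is a list of (supporter, supported) index pairs;
# supporter -1 stands for the ground plane at z = 0.

def contact_gaps(z, heights, contact_edges):
    """Signed gap per contact edge: > 0 means floating, < 0 means penetration."""
    gaps = []
    for parent, child in contact_edges:
        top = 0.0 if parent < 0 else z[parent] + heights[parent] / 2
        gaps.append((z[child] - heights[child] / 2) - top)
    return gaps


def refine_heights(z_obs, heights, contact_edges,
                   data_weight=0.1, lr=0.1, steps=500):
    """Gradient descent on
         data_weight * sum_i (z_i - z_obs_i)^2 + sum_edges gap^2,
    trading off fidelity to the observed poses against contact consistency."""
    z = list(z_obs)
    for _ in range(steps):
        # Gradient of the data term pulls each box back to its observation.
        grad = [2.0 * data_weight * (zi - oi) for zi, oi in zip(z, z_obs)]
        # Gradient of the contact term closes gaps and resolves overlaps.
        for (parent, child), gap in zip(contact_edges,
                                        contact_gaps(z, heights, contact_edges)):
            grad[child] += 2.0 * gap
            if parent >= 0:
                grad[parent] -= 2.0 * gap
        z = [zi - lr * gi for zi, gi in zip(z, grad)]
    return z


# Two 10 cm boxes: box 0 should rest on the ground, box 1 on box 0.
heights = [0.10, 0.10]
contact_edges = [(-1, 0), (0, 1)]
z_obs = [0.07, 0.13]  # noisy estimate: box 0 floats, box 1 penetrates box 0

before = contact_gaps(z_obs, heights, contact_edges)
z_refined = refine_heights(z_obs, heights, contact_edges)
after = contact_gaps(z_refined, heights, contact_edges)
```

Because the penalty is soft, the refined gaps shrink toward zero rather than vanishing exactly; a larger contact weight or a hard constraint would tighten them further at the cost of drifting from the observation.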

