MonoPhysics: Estimating Geometry, Appearance, and Physical Parameters from Monocular Videos
Enables inverse physics estimation from a single camera, addressing scale ambiguity and weak coupling in monocular settings for deformable objects.
MonoPhysics jointly estimates geometry, appearance, and physical parameters of deformable objects from monocular videos using differentiable MPM simulation and 3D Gaussian Splatting, outperforming existing monocular baselines and achieving performance comparable to multi-view methods.
Existing inverse physics methods recover physical parameters from multi-view videos, where geometric constraints across views resolve scale and 3D structure. In monocular settings, however, such constraints are absent, leading to severe scale ambiguity, inaccurate geometry, and weak coupling between appearance optimization and physical simulation. We propose MonoPhysics, a framework for monocular inverse physics estimation of deformable objects using differentiable MPM simulation and 3D Gaussian Splatting, which jointly optimizes geometry, appearance, and physical parameters from a single camera view. We address these challenges through three visual-physical bridges: global scale alignment, physics-aware geometry refinement, and a differentiable position map, which together enable accurate optimization from monocular observations alone. We evaluate on Vid2Sim and our new dataset of elastic and plastic objects, showing that MonoPhysics outperforms existing baselines in monocular settings and achieves performance comparable to multi-view baselines using only a single camera. Our project page is available at https://daniel03c1.github.io/MonoPhysics/
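The scale ambiguity mentioned above can be illustrated with a minimal sketch. It is not part of MonoPhysics; it only shows the standard pinhole-camera fact that a scene uniformly scaled about the camera center projects to the same pixels, so a single view cannot determine absolute size. The function names (`project`, `scale_scene`) and the intrinsics (`focal`, `cx`, `cy`) are hypothetical values chosen for illustration.

```python
# Illustration only: monocular scale ambiguity under a pinhole camera.
# All names and intrinsics are assumptions, not taken from the paper.

def project(points, focal=500.0, cx=320.0, cy=240.0):
    """Project camera-frame 3D points (x, y, z) to pixel coordinates."""
    return [(focal * x / z + cx, focal * y / z + cy) for x, y, z in points]

def scale_scene(points, s):
    """Uniformly scale all points about the camera center by factor s."""
    return [(s * x, s * y, s * z) for x, y, z in points]

# A few points in front of the camera (z > 0), in arbitrary units.
points = [(0.1, -0.2, 2.0), (0.3, 0.1, 2.5), (-0.2, 0.05, 3.0)]

pixels = project(points)
pixels_scaled = project(scale_scene(points, 3.0))

# Largest per-coordinate pixel difference between the two projections.
max_diff = max(
    abs(a - b)
    for p, q in zip(pixels, pixels_scaled)
    for a, b in zip(p, q)
)
print(f"max pixel difference: {max_diff:.2e}")
```

Both scenes yield the same image up to floating-point error, while their metric sizes differ by a factor of three. Quantities that depend on metric length, such as displacement under gravity, would differ between them, which is one reason a scale alignment step is plausible before fitting physical parameters from one view.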