ProDyG: Progressive Dynamic Scene Reconstruction via Gaussian Splatting from Monocular Videos
The work addresses practical dynamic reconstruction for applications that require real-time operation and global consistency, though the contribution appears incremental because it builds on existing SLAM and Gaussian splatting techniques.
The paper tackles online dynamic 3D scene reconstruction from monocular videos by disentangling the static and dynamic parts of the scene within a SLAM system. It reports novel view renderings competitive with offline methods and tracking on par with state-of-the-art dynamic SLAM methods.
Achieving truly practical dynamic 3D reconstruction requires online operation, global pose and map consistency, detailed appearance modeling, and the flexibility to handle both RGB and RGB-D inputs. However, existing SLAM methods typically either just remove the dynamic parts or require RGB-D input, offline methods do not scale to long video sequences, and current transformer-based feedforward methods lack global consistency and appearance detail. To this end, we achieve online dynamic scene reconstruction by disentangling the static and dynamic parts within a SLAM system. The poses are tracked robustly with a novel motion masking strategy, and the dynamic parts are reconstructed by leveraging a progressive adaptation of a Motion Scaffolds graph. Our method yields novel view renderings competitive with offline methods and achieves tracking on par with state-of-the-art dynamic SLAM methods.
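The abstract does not spell out how the motion masking strategy works. As a generic illustration only, the sketch below shows the common idea behind masking in dynamic SLAM: pixels flagged as moving are excluded from the photometric error used for camera tracking, so moving objects do not bias the pose estimate. The function name `masked_photometric_loss`, the L1 error, and the toy data are all assumptions made for this example, not the paper's implementation.

```python
import numpy as np

def masked_photometric_loss(rendered, observed, dynamic_mask):
    """Mean L1 colour error over pixels NOT flagged as dynamic.

    Hypothetical helper for illustration; not taken from the paper.
    rendered, observed: (H, W, 3) images; dynamic_mask: (H, W), True = moving.
    """
    static = ~dynamic_mask.astype(bool)
    if not static.any():
        return 0.0  # no static pixels left to track against
    residual = np.abs(rendered.astype(np.float64) - observed.astype(np.float64))
    return float(residual[static].mean())

# Toy 2x2 frame: one pixel differs because an object moved through it.
rendered = np.zeros((2, 2, 3))
observed = np.zeros((2, 2, 3))
observed[0, 0] = 1.0
mask = np.zeros((2, 2), dtype=bool)
mask[0, 0] = True  # assumed to come from some motion segmentation step

with_mask = masked_photometric_loss(rendered, observed, mask)
without_mask = masked_photometric_loss(rendered, observed, np.zeros((2, 2), dtype=bool))
print(with_mask, without_mask)
```

In this toy case the masked error is zero while the unmasked error is not, which is why a tracker optimising the masked error would not be pulled toward explaining the moving pixel with camera motion.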