Prior-Enhanced Gaussian Splatting for Dynamic Scene Reconstruction from Casual Video
This work addresses dynamic scene reconstruction for applications such as AR/VR and robotics. Its contribution is incremental: it builds on existing Gaussian Splatting methods rather than proposing a new scene representation.
The authors tackle dynamic scene reconstruction from casual monocular videos by enhancing the priors that drive Dynamic Gaussian Splatting. They report that the resulting system surpasses previous monocular methods and delivers visibly superior renderings.
We introduce a fully automatic pipeline for dynamic scene reconstruction from casually captured monocular RGB videos. Rather than designing a new scene representation, we enhance the priors that drive Dynamic Gaussian Splatting. Video segmentation combined with epipolar-error maps yields object-level masks that closely follow thin structures; these masks (i) guide an object-depth loss that sharpens the consistent video depth, and (ii) support skeleton-based sampling plus mask-guided re-identification to produce reliable, comprehensive 2-D tracks. Two additional objectives embed the refined priors in the reconstruction stage: a virtual-view depth loss removes floaters, and a scaffold-projection loss ties motion nodes to the tracks, preserving fine geometry and coherent motion. The resulting system surpasses previous monocular dynamic scene reconstruction methods and delivers visibly superior renderings.
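
To make the notion of an epipolar-error map concrete, the sketch below scores point correspondences between two frames against a fundamental matrix and flags those that violate the static-scene epipolar constraint. The abstract does not state which error measure is used; the Sampson distance here is an assumption, as are the names `sampson_error` and `dynamic_mask` and the threshold value. It is an illustration of the general idea, not the pipeline's actual implementation.

```python
from typing import List, Sequence, Tuple

Matrix3 = Sequence[Sequence[float]]   # 3x3 fundamental matrix, row-major
Point = Tuple[float, float]           # pixel coordinates (x, y)


def sampson_error(F: Matrix3, p1: Point, p2: Point) -> float:
    """First-order (Sampson) approximation of the squared epipolar error
    for a correspondence p1 -> p2 under fundamental matrix F."""
    x1 = (p1[0], p1[1], 1.0)  # homogeneous point in frame 1
    x2 = (p2[0], p2[1], 1.0)  # homogeneous point in frame 2

    # Epipolar line of x1 in frame 2 (F x1) and of x2 in frame 1 (F^T x2).
    Fx1 = [sum(F[i][j] * x1[j] for j in range(3)) for i in range(3)]
    Ftx2 = [sum(F[j][i] * x2[j] for j in range(3)) for i in range(3)]

    # Algebraic residual x2^T F x1, squared.
    num = sum(x2[i] * Fx1[i] for i in range(3)) ** 2
    den = Fx1[0] ** 2 + Fx1[1] ** 2 + Ftx2[0] ** 2 + Ftx2[1] ** 2
    return num / den if den > 0 else 0.0


def dynamic_mask(F: Matrix3,
                 matches: Sequence[Tuple[Point, Point]],
                 threshold: float) -> List[bool]:
    """Flag correspondences whose error exceeds the (assumed) threshold,
    i.e. points that likely belong to independently moving objects."""
    return [sampson_error(F, p1, p2) > threshold for p1, p2 in matches]


# Toy setup: camera translating along x, so static points stay on their row.
F_sideways = [[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]]
matches = [((10.0, 20.0), (15.0, 20.0)),   # consistent with camera motion
           ((10.0, 20.0), (15.0, 23.0))]   # moved off its epipolar line
mask = dynamic_mask(F_sideways, matches, threshold=1.0)
```

In a dense setting the same score would be evaluated per pixel from optical-flow correspondences, and the resulting map could then be combined with video segmentation to select object-level masks, as the abstract describes.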