CV · AI · Apr 6

GA-GS: Generation-Assisted Gaussian Splatting for Static Scene Reconstruction

arXiv:2604.04331 · 73.6
AI Analysis

This work addresses a challenge in applications such as virtual reality and autonomous driving: reconstructing a static scene from video that contains dynamic objects. The contribution is incremental, since it builds on existing Gaussian splatting and inpainting techniques.

The paper tackles the reconstruction of static 3D scenes from monocular videos in which dynamic objects occlude parts of the background. It proposes GA-GS, a method that uses a diffusion model to inpaint the occluded areas, and reports state-of-the-art performance, particularly in scenarios with large-scale occlusions.

Reconstructing a static 3D scene from monocular video with dynamic objects is important for numerous applications such as virtual reality and autonomous driving. Current approaches typically rely on the visible background for static scene reconstruction, which limits their ability to recover regions occluded by dynamic objects. In this paper, we propose GA-GS, a Generation-Assisted Gaussian Splatting method for Static Scene Reconstruction. The key innovation of our work lies in leveraging generation to assist in reconstructing occluded regions. We employ a motion-aware module to segment and remove dynamic regions, and then use a diffusion model to inpaint the occluded areas, providing pseudo-ground-truth supervision. To balance contributions from the real background and the generated regions, we introduce a learnable authenticity scalar for each Gaussian primitive, which dynamically modulates opacity during splatting for authenticity-aware rendering and supervision. Since no existing dataset provides the ground-truth static scene for videos with dynamic objects, we construct a dataset named Trajectory-Match, using a fixed-path robot to record each scene with and without dynamic objects, enabling quantitative evaluation of the reconstruction of occluded regions. Extensive experiments on both DAVIS and our dataset show that GA-GS achieves state-of-the-art performance in static scene reconstruction, especially in challenging scenarios with large-scale, persistent occlusions.
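
The abstract does not give the formula behind the authenticity scalar, so the following is only a minimal sketch of one way a per-Gaussian scalar could modulate opacity during splatting. It assumes the scalar is stored as a logit, squashed with a sigmoid, and multiplied into the opacity before standard front-to-back alpha compositing along a single ray. The function names `sigmoid` and `composite_ray`, the `authenticity_aware` flag, and the multiplicative form are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def sigmoid(x):
    """Map an unconstrained logit to the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def composite_ray(colors, opacities, authenticity_logits, authenticity_aware=True):
    """Front-to-back alpha compositing of depth-sorted Gaussians on one ray.

    colors:              (N, 3) RGB per Gaussian, nearest first
    opacities:           (N,) base opacity in [0, 1]
    authenticity_logits: (N,) hypothetical learnable scalar per Gaussian
    Returns the composited RGB value and the per-Gaussian blending weights.
    """
    colors = np.asarray(colors, dtype=float)
    alphas = np.asarray(opacities, dtype=float)
    if authenticity_aware:
        # Assumption: authenticity scales opacity multiplicatively.
        alphas = alphas * sigmoid(np.asarray(authenticity_logits, dtype=float))
    # Transmittance reaching each Gaussian: product of (1 - alpha) in front of it.
    transmittance = np.concatenate(([1.0], np.cumprod(1.0 - alphas)[:-1]))
    weights = transmittance * alphas
    return weights @ colors, weights


# A red Gaussian in front of a blue one, both with base opacity 0.8.
colors = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
opacities = [0.8, 0.8]
logits = [-20.0, 20.0]  # front Gaussian treated as low-authenticity

plain_pixel, _ = composite_ray(colors, opacities, logits, authenticity_aware=False)
aware_pixel, _ = composite_ray(colors, opacities, logits, authenticity_aware=True)
print(plain_pixel, aware_pixel)
```

In this toy setup, the low-authenticity front Gaussian is almost entirely suppressed when the modulation is switched on, so the pixel is dominated by the Gaussian behind it. How GA-GS actually uses the scalar for rendering and supervision may differ from this sketch.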

