CV, Mar 2

Sparse View Distractor-Free Gaussian Splatting

arXiv:2603.01603v1
h-index: 1
Originality: Incremental advance
AI Analysis

This work addresses robust 3D scene reconstruction for applications such as augmented reality and robotics, but it is incremental because it builds on existing distractor-free methods.

The paper tackles transient object removal in 3D Gaussian Splatting under sparse-view conditions, improving performance by incorporating geometric and semantic priors from foundation models.

3D Gaussian Splatting (3DGS) enables efficient training and fast novel view synthesis in static environments. To address challenges posed by transient objects, distractor-free 3DGS methods have emerged and shown promising results when dense image captures are available. However, their performance degrades significantly under sparse input conditions. This limitation primarily stems from their reliance on color-residual heuristics to guide training, which become unreliable with limited observations. In this work, we propose a framework to enhance distractor-free 3DGS under sparse-view conditions by incorporating rich prior information. Specifically, we first adopt the geometry foundation model VGGT to estimate camera parameters and generate a dense set of initial 3D points. Then, we harness the attention maps from VGGT for efficient and accurate semantic entity matching. Additionally, we utilize Vision-Language Models (VLMs) to further identify and preserve the large static regions in the scene. We also demonstrate how these priors can be seamlessly integrated into existing distractor-free 3DGS methods. Extensive experiments confirm the effectiveness and robustness of our approach in mitigating transient distractors for sparse-view 3DGS training.
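
The color-residual heuristic that the abstract identifies as the weak point can be illustrated with a minimal sketch. This is not the paper's method or any specific library's API: the function name `residual_mask`, the grayscale list-of-lists image format, and the quantile threshold rule are all assumptions chosen for illustration. The idea is that pixels where the rendered image disagrees most with the captured image are treated as likely transient distractors.

```python
from typing import List

# Assumption: a grayscale image stored as rows of intensities in [0, 1].
Image = List[List[float]]


def residual_mask(rendered: Image, observed: Image, quantile: float = 0.9) -> List[List[bool]]:
    """Flag pixels whose absolute residual is strictly above a quantile threshold.

    Hypothetical helper for illustration only; real distractor-free 3DGS
    methods use more elaborate, learned or smoothed residual statistics.
    """
    # Per-pixel absolute difference between the render and the captured photo.
    residuals = [
        [abs(r - o) for r, o in zip(row_r, row_o)]
        for row_r, row_o in zip(rendered, observed)
    ]
    # Nearest-rank quantile over all residuals in the image.
    ordered = sorted(value for row in residuals for value in row)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    threshold = ordered[index]
    # True marks a suspected distractor pixel to be down-weighted in the loss.
    return [[value > threshold for value in row] for row in residuals]


rendered = [[0.0, 0.0], [0.0, 0.0]]
observed = [[0.0, 0.25], [0.0, 1.0]]
mask = residual_mask(rendered, observed, quantile=0.5)
```

With few input views the render itself is poorly constrained, so large residuals can arise from under-fitted static regions as well as from true distractors, which is consistent with the abstract's claim that this signal becomes unreliable under sparse inputs.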

