CV · Feb 17

Semantic-Guided 3D Gaussian Splatting for Transient Object Removal

arXiv:2602.15516v1 · 1 citation
Originality: Incremental advance
AI Analysis

This paper addresses transient object removal in 3D reconstruction from casual multi-view captures. It is an incremental advance that builds on existing 3DGS methods.

The paper tackles ghosting artifacts caused by transient objects in 3D Gaussian Splatting reconstructions. It proposes a semantic filtering framework based on vision-language models and reports consistent improvements in reconstruction quality over vanilla 3DGS on the RobustNeRF benchmark, with minimal memory overhead and real-time rendering.

Transient objects in casual multi-view captures cause ghosting artifacts in 3D Gaussian Splatting (3DGS) reconstruction. Existing solutions relied on scene decomposition at significant memory cost or on motion-based heuristics that were vulnerable to parallax ambiguity. A semantic filtering framework was proposed for category-aware transient removal using vision-language models. CLIP similarity scores between rendered views and distractor text prompts were accumulated per-Gaussian across training iterations. Gaussians exceeding a calibrated threshold underwent opacity regularization and periodic pruning. Unlike motion-based approaches, semantic classification resolved parallax ambiguity by identifying object categories independently of motion patterns. Experiments on the RobustNeRF benchmark demonstrated consistent improvement in reconstruction quality over vanilla 3DGS across four sequences, while maintaining minimal memory overhead and real-time rendering performance. Threshold calibration and comparisons with baselines validated semantic guidance as a practical strategy for transient removal in scenarios with predictable distractor categories.
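The accumulate, threshold, regularize, and prune loop described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The abstract does not say how a view-level CLIP similarity is attributed to individual Gaussians, so the per-Gaussian `contribution` weights, the running-mean accumulation, the L1-style opacity penalty, and all names and default values (`SemanticScoreAccumulator`, `flag_transients`, `opacity_penalty`, `keep_mask`, `min_opacity`) are assumptions made for illustration.

```python
from typing import List


class SemanticScoreAccumulator:
    """Hypothetical helper: weighted running mean of distractor similarity per Gaussian."""

    def __init__(self, num_gaussians: int) -> None:
        self.score_sum: List[float] = [0.0] * num_gaussians
        self.weight_sum: List[float] = [0.0] * num_gaussians

    def update(self, view_similarity: float, contribution: List[float]) -> None:
        # view_similarity: CLIP similarity between one rendered view and the
        # distractor prompts (computed elsewhere; assumed to be a scalar here).
        # contribution[i]: assumed weight in [0, 1] for how much Gaussian i
        # contributed to that rendered view.
        for i, w in enumerate(contribution):
            self.score_sum[i] += view_similarity * w
            self.weight_sum[i] += w

    def mean_scores(self) -> List[float]:
        # Gaussians that never contributed to a view keep a score of 0.
        return [s / w if w > 0 else 0.0
                for s, w in zip(self.score_sum, self.weight_sum)]


def flag_transients(scores: List[float], threshold: float) -> List[bool]:
    # A Gaussian is flagged when its accumulated score exceeds the threshold.
    return [s > threshold for s in scores]


def opacity_penalty(opacities: List[float], flagged: List[bool],
                    strength: float = 1.0) -> float:
    # Assumed L1-style regularizer: pushes the opacity of flagged Gaussians to 0.
    return strength * sum(o for o, f in zip(opacities, flagged) if f)


def keep_mask(opacities: List[float], flagged: List[bool],
              min_opacity: float = 0.05) -> List[bool]:
    # Periodic pruning: drop flagged Gaussians whose opacity has decayed.
    return [not (f and o < min_opacity) for o, f in zip(opacities, flagged)]


# Usage with three Gaussians and two training iterations.
acc = SemanticScoreAccumulator(3)
acc.update(0.8, [1.0, 0.0, 0.5])
acc.update(0.2, [1.0, 0.0, 0.5])
scores = acc.mean_scores()
flagged = flag_transients(scores, threshold=0.4)
opacities = [0.01, 0.9, 0.6]
penalty = opacity_penalty(opacities, flagged)
keep = keep_mask(opacities, flagged)
```

In a real training loop the penalty would be added to the photometric loss as a differentiable term and the keep mask applied at the pruning interval. The threshold of 0.4 here is arbitrary, whereas the paper calibrates its threshold.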

