CV, LG | Apr 2, 2025

Diffusion-Guided Gaussian Splatting for Large-Scale Unconstrained 3D Reconstruction and Novel View Synthesis

arXiv:2504.01960v1 | 2 citations | h-index: 30
Originality: Highly original
AI Analysis

The paper addresses robust 3D reconstruction in real-world, unconstrained settings, with applications in robotics, AR/VR, and autonomous systems. It represents a novel integration rather than an incremental improvement.

The paper tackles degraded quality in 3D reconstruction and novel view synthesis for large-scale, unconstrained environments with sparse or uneven input data. It proposes GS-Diff, a 3D Gaussian Splatting framework guided by a multi-view diffusion model, which the authors report outperforms state-of-the-art baselines on four benchmarks.

Recent advancements in 3D Gaussian Splatting (3DGS) and Neural Radiance Fields (NeRF) have achieved impressive results in real-time 3D reconstruction and novel view synthesis. However, these methods struggle in large-scale, unconstrained environments where sparse and uneven input coverage, transient occlusions, appearance variability, and inconsistent camera settings lead to degraded quality. We propose GS-Diff, a novel 3DGS framework guided by a multi-view diffusion model to address these limitations. By generating pseudo-observations conditioned on multi-view inputs, our method transforms under-constrained 3D reconstruction problems into well-posed ones, enabling robust optimization even with sparse data. GS-Diff further integrates several enhancements, including appearance embedding, monocular depth priors, dynamic object modeling, anisotropy regularization, and advanced rasterization techniques, to tackle geometric and photometric challenges in real-world settings. Experiments on four benchmarks demonstrate that GS-Diff consistently outperforms state-of-the-art baselines by significant margins.
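
The abstract names two ingredients that are easy to picture as loss terms: supervision from diffusion-generated pseudo-observations, and anisotropy regularization on the Gaussians. The sketch below is only an illustration of how such terms could be combined, not the authors' implementation. The function names, the plain L1 photometric term, the weights `pseudo_weight` and `aniso_weight`, and the threshold `max_ratio` are all assumptions introduced here; the abstract does not specify any of them.

```python
import numpy as np

def l1_photometric(rendered, target):
    """Mean absolute error between two images of the same shape."""
    return float(np.mean(np.abs(rendered - target)))

def anisotropy_penalty(scales, max_ratio=10.0):
    """Penalize Gaussians whose longest axis exceeds max_ratio times the shortest.

    scales: (N, 3) array of positive per-axis scales (hypothetical layout).
    """
    ratio = scales.max(axis=1) / scales.min(axis=1)
    return float(np.mean(np.maximum(ratio - max_ratio, 0.0)))

def total_loss(real_pairs, pseudo_pairs, scales,
               pseudo_weight=0.5, aniso_weight=0.1):
    """Combine real-view loss, down-weighted pseudo-view loss, and a regularizer.

    real_pairs / pseudo_pairs: lists of (rendered, target) image arrays.
    Pseudo targets stand in for diffusion-generated views; they are
    down-weighted here on the assumption that they are less reliable
    than captured images.
    """
    real = np.mean([l1_photometric(r, t) for r, t in real_pairs])
    pseudo = (np.mean([l1_photometric(r, t) for r, t in pseudo_pairs])
              if pseudo_pairs else 0.0)
    reg = anisotropy_penalty(scales)
    return float(real + pseudo_weight * pseudo + aniso_weight * reg)
```

In this sketch the pseudo-observations add extra (rendered, target) pairs for viewpoints the captured data does not cover, which is one way to read the abstract's claim that they make an under-constrained problem better posed. The regularizer discourages needle-like Gaussians that tend to appear when views are sparse.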
