CV | Aug 6, 2025

MuGS: Multi-Baseline Generalizable Gaussian Splatting Reconstruction

arXiv:2508.04297v2 | 1 citation | h-index: 8 | Has Code
Originality: Highly original
AI Analysis

This paper addresses generalizable 3D reconstruction for computer vision applications. It is an incremental improvement that takes a hybrid approach, combining multi-view stereo and monocular depth features.

The paper tackles novel view synthesis across diverse baseline settings by integrating MVS and MDE features with a projection-and-sampling mechanism for depth fusion, achieving state-of-the-art performance on datasets like DTU and RealEstate10K.

We present Multi-Baseline Gaussian Splatting (MuGS), a generalized feed-forward approach for novel view synthesis that effectively handles diverse baseline settings, including sparse input views with both small and large baselines. Specifically, we integrate features from Multi-View Stereo (MVS) and Monocular Depth Estimation (MDE) to enhance feature representations for generalizable reconstruction. Next, we propose a projection-and-sampling mechanism for deep depth fusion, which constructs a fine probability volume to guide the regression of the feature map. Furthermore, we introduce a reference-view loss to improve geometry and optimization efficiency. We leverage 3D Gaussian representations to accelerate training and inference while enhancing rendering quality. MuGS achieves state-of-the-art performance across multiple baseline settings and diverse scenarios ranging from simple objects (DTU) to complex indoor and outdoor scenes (RealEstate10K). We also demonstrate promising zero-shot performance on the LLFF and Mip-NeRF 360 datasets. Code is available at https://github.com/EuclidLou/MuGS.

