CV | Nov 15, 2025

SRSplat: Feed-Forward Super-Resolution Gaussian Splatting from Sparse Multi-View Images

arXiv:2511.12040v1 | 2 citations | h-index: 2
Originality: Highly original
AI Analysis

The work addresses the need for detailed 3D reconstruction in applications such as autonomous driving and embodied AI. It is incremental in that it builds on existing Gaussian splatting methods, to which it adds new components.

The paper tackles feed-forward 3D reconstruction from sparse, low-resolution images, a setting in which existing methods often fail to recover fine texture details. It proposes SRSplat, a framework that combines external reference images with internal texture cues to reconstruct high-resolution 3D scenes. According to the authors, SRSplat outperforms existing methods on RealEstate10K, ACID, and DTU and generalizes well across datasets and resolutions.

Feed-forward 3D reconstruction from sparse, low-resolution (LR) images is a crucial capability for real-world applications, such as autonomous driving and embodied AI. However, existing methods often fail to recover fine texture details. This limitation stems from the inherent lack of high-frequency information in LR inputs. To address this, we propose SRSplat, a feed-forward framework that reconstructs high-resolution 3D scenes from only a few LR views. Our main insight is to compensate for the deficiency of texture information by jointly leveraging external high-quality reference images and internal texture cues. We first construct a scene-specific reference gallery, generated for each scene using Multimodal Large Language Models (MLLMs) and diffusion models. To integrate this external information, we introduce the Reference-Guided Feature Enhancement (RGFE) module, which aligns and fuses features from the LR input images and their reference twin image. Subsequently, we train a decoder to predict the Gaussian primitives using the multi-view fused feature obtained from RGFE. To further refine predicted Gaussian primitives, we introduce Texture-Aware Density Control (TADC), which adaptively adjusts Gaussian density based on the internal texture richness of the LR inputs. Extensive experiments demonstrate that our SRSplat outperforms existing methods on various datasets, including RealEstate10K, ACID, and DTU, and exhibits strong cross-dataset and cross-resolution generalization capabilities.
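
The abstract does not say how TADC measures texture richness or how it changes the Gaussian density. As a rough illustration of the general idea only, the sketch below scores each pixel of a grayscale LR image by its local gradient magnitude and assigns more Gaussians to pixels whose score exceeds a threshold. The gradient-based score, the threshold rule, and every name and parameter here (`texture_richness`, `gaussians_per_pixel`, `base`, `extra`, `threshold`) are assumptions made for this example, not the paper's method.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of values in [0, 1]


def texture_richness(img: Image) -> Image:
    """Per-pixel gradient magnitude from forward differences.

    Assumed stand-in for a texture score; border pixels reuse their own
    value as the neighbor, so the difference there is zero.
    """
    h, w = len(img), len(img[0])
    score = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = img[y][min(x + 1, w - 1)] - img[y][x]
            dy = img[min(y + 1, h - 1)][x] - img[y][x]
            score[y][x] = (dx * dx + dy * dy) ** 0.5
    return score


def gaussians_per_pixel(
    score: Image, base: int = 1, extra: int = 3, threshold: float = 0.1
) -> List[List[int]]:
    """Hypothetical density rule: `base` Gaussians per pixel, plus `extra`
    wherever the texture score exceeds `threshold`."""
    return [[base + extra if s > threshold else base for s in row] for row in score]


# Toy 4x4 image: dark left half, bright right half (one vertical edge).
img = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
counts = gaussians_per_pixel(texture_richness(img))
total = sum(sum(row) for row in counts)
```

In this toy image only the column just left of the edge receives the extra Gaussians, so flat regions stay sparse while the textured boundary is densified. A learned or multi-scale texture measure could replace the gradient score without changing the overall structure.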
