CV · AI · Jan 7

IDESplat: Iterative Depth Probability Estimation for Generalizable 3D Gaussian Splatting

arXiv:2601.03824v1 · 3 citations · h-index: 2
Originality: Incremental advance
AI Analysis

The paper addresses a bottleneck in feed-forward 3D scene reconstruction, namely unstable depth estimation, and offers an incremental improvement over existing depth estimation methods.

The paper tackles unstable and coarse depth estimation in generalizable 3D Gaussian Splatting by proposing IDESplat, which iteratively applies warp operations to refine depth probability estimates. The method reports state-of-the-art reconstruction quality with real-time efficiency, outperforming DepthSplat by 0.33 dB in PSNR on RealEstate10K (RE10K) while using only 10.7% of the parameters and 70% of the memory.

Generalizable 3D Gaussian Splatting aims to directly predict Gaussian parameters using a feed-forward network for scene reconstruction. Among these parameters, Gaussian means are particularly difficult to predict, so depth is usually estimated first and then unprojected to obtain the Gaussian sphere centers. Existing methods typically rely on a single warp to estimate depth probability, which hinders their ability to fully leverage cross-view geometric cues and results in unstable and coarse depth maps. To address this limitation, we propose IDESplat, which iteratively applies warp operations to boost depth probability estimation for accurate Gaussian mean prediction. First, to eliminate the inherent instability of a single warp, we introduce a Depth Probability Boosting Unit (DPBU) that integrates epipolar attention maps produced by cascading warp operations in a multiplicative manner. Next, we construct an iterative depth estimation process by stacking multiple DPBUs, progressively identifying potential depth candidates with high likelihood. As IDESplat iteratively boosts depth probability estimates and updates the depth candidates, the depth map is gradually refined, resulting in accurate Gaussian means. We conduct experiments on RealEstate10K (RE10K), ACID, and DL3DV. IDESplat achieves outstanding reconstruction quality and state-of-the-art performance with real-time efficiency. On RE10K, it outperforms DepthSplat by 0.33 dB in PSNR, using only 10.7% of the parameters and 70% of the memory. Additionally, IDESplat improves PSNR by 2.95 dB over DepthSplat on the DTU dataset in cross-dataset experiments, demonstrating its strong generalization ability.
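
The pipeline the abstract describes (score depth candidates, combine several warps multiplicatively, narrow the candidate set, then unproject the depth to a Gaussian center) can be illustrated for a single pixel with the toy sketch below. It is not the authors' DPBU: plain softmax scores stand in for the epipolar attention maps, the camera is assumed to be an ideal pinhole, and every function name and number here (`boost`, `refine_candidates`, `unproject`, the scores, the intrinsics) is hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Turn raw matching scores into a probability over depth candidates."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def boost(prob_maps):
    """Multiply per-candidate probabilities from several warps, then renormalize.

    Assumption: this elementwise product is only a stand-in for the
    multiplicative integration of attention maps mentioned in the abstract.
    """
    fused = [1.0] * len(prob_maps[0])
    for probs in prob_maps:
        fused = [f * p for f, p in zip(fused, probs)]
    total = sum(fused)
    return [f / total for f in fused]


def expected_depth(probs, candidates):
    """Probability-weighted mean of the depth candidates."""
    return sum(p * d for p, d in zip(probs, candidates))


def refine_candidates(center, half_range, n):
    """Resample n uniformly spaced candidates in a window around center."""
    step = 2.0 * half_range / (n - 1)
    return [center - half_range + i * step for i in range(n)]


def unproject(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) at the given depth to camera-space XYZ (pinhole model)."""
    return ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)


# Hypothetical scores from two warps over four depth candidates (in meters).
candidates = [1.0, 2.0, 3.0, 4.0]
p1 = softmax([0.0, 2.0, 1.0, 0.0])
p2 = softmax([1.0, 2.0, 0.0, 0.0])

# Both warps weakly favor the second candidate; their product favors it strongly.
fused = boost([p1, p2])
depth = expected_depth(fused, candidates)

# A second iteration would score a narrower window around the current estimate.
next_candidates = refine_candidates(depth, 0.5, 5)

# Assumed intrinsics for a 640x480 image; the result is one Gaussian center.
center = unproject(420, 240, depth, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(fused, depth, next_candidates, center)
```

In this toy setting the product of two noisy distributions that agree on a candidate is sharper than either one alone, which is the intuition behind combining several warps instead of trusting a single one. Repeating the score, boost, and resample steps on progressively narrower windows mirrors the stacking of units described above, again only as a sketch.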
