DepthSplat: Connecting Gaussian Splatting and Depth
This work integrates feed-forward 3D Gaussian splatting reconstruction with depth estimation, offering incremental improvements to both by linking two lines of work that were previously studied in isolation.
The paper connects Gaussian splatting and depth estimation and shows that the two tasks benefit each other: it achieves state-of-the-art performance on ScanNet, RealEstate10K, and DL3DV for both depth estimation and novel view synthesis, and enables feed-forward reconstruction from 12 input views (512x960 resolution) in 0.6 seconds.
Gaussian splatting and single-view depth estimation are typically studied in isolation. In this paper, we present DepthSplat to connect Gaussian splatting and depth estimation and study their interactions. More specifically, we first contribute a robust multi-view depth model by leveraging pre-trained monocular depth features, leading to high-quality feed-forward 3D Gaussian splatting reconstructions. We also show that Gaussian splatting can serve as an unsupervised pre-training objective for learning powerful depth models from large-scale multi-view posed datasets. We validate the synergy between Gaussian splatting and depth estimation through extensive ablation and cross-task transfer experiments. Our DepthSplat achieves state-of-the-art performance on the ScanNet, RealEstate10K, and DL3DV datasets in terms of both depth estimation and novel view synthesis, demonstrating the mutual benefits of connecting both tasks. In addition, DepthSplat enables feed-forward reconstruction from 12 input views (512x960 resolution) in 0.6 seconds.
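The abstract does not spell out how a depth map turns into Gaussians. One common link in feed-forward pipelines is to unproject each pixel's predicted depth into a 3D point and use it as a Gaussian center. The sketch below illustrates only that geometric step and is not the authors' implementation: the function name `unproject_depth`, the pinhole intrinsics `K`, the 4x4 `cam_to_world` pose, and the half-pixel center convention are all assumptions made for illustration.

```python
import numpy as np

def unproject_depth(depth, K, cam_to_world):
    """Lift an (H, W) depth map to (H*W, 3) world-space points.

    Hypothetical helper: assumes a pinhole camera with 3x3 intrinsics K,
    depth measured along the camera z-axis, and a 4x4 camera-to-world pose.
    """
    h, w = depth.shape
    # Pixel centers in homogeneous image coordinates (u, v, 1).
    u, v = np.meshgrid(np.arange(w) + 0.5, np.arange(h) + 0.5)
    pixels = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
    # Back-project through the inverse intrinsics, then scale by depth.
    rays = pixels @ np.linalg.inv(K).T
    points_cam = rays * depth.reshape(-1, 1)
    # Move from camera to world coordinates using the pose.
    R, t = cam_to_world[:3, :3], cam_to_world[:3, 3]
    return points_cam @ R.T + t
```

Under these assumptions, better depth gives better-placed Gaussian centers. In the other direction, a rendering loss on the splatted views can send gradients back to the depth prediction without ground-truth depth, which is the sense in which splatting can act as an unsupervised objective.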