WaveletGaussian: Wavelet-domain Diffusion for Sparse-view 3D Gaussian Object Reconstruction
This work targets the heavy computation that prior diffusion-based methods for sparse-view 3D reconstruction incur in their diffusion fine-tuning and repair steps, offering a more efficient alternative for computer vision and graphics applications.
The paper tackles 3D Gaussian object reconstruction in sparse-view settings, where rendering quality drops sharply. It proposes WaveletGaussian, which shifts diffusion into the wavelet domain: diffusion runs only on the low-resolution LL subband, and a lightweight network refines the high-frequency subbands. The method achieves competitive rendering quality while substantially reducing training time.
3D Gaussian Splatting (3DGS) has become a powerful representation for image-based object reconstruction, yet its performance drops sharply in sparse-view settings. Prior works address this limitation by employing diffusion models to repair corrupted renders, subsequently using them as pseudo ground truths for later optimization. While effective, such approaches incur heavy computation from the diffusion fine-tuning and repair steps. We present WaveletGaussian, a framework for more efficient sparse-view 3D Gaussian object reconstruction. Our key idea is to shift diffusion into the wavelet domain: diffusion is applied only to the low-resolution LL subband, while high-frequency subbands are refined with a lightweight network. We further propose an efficient online random masking strategy to curate training pairs for diffusion fine-tuning, replacing the commonly used, but inefficient, leave-one-out strategy. Experiments across two benchmark datasets, Mip-NeRF 360 and OmniObject3D, show WaveletGaussian achieves competitive rendering quality while substantially reducing training time.
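To make the wavelet-domain idea concrete, the following is a minimal sketch of a single-level 2D wavelet decomposition. It assumes a Haar wavelet, which the abstract does not specify, and the function names `haar_dwt2` and `haar_idwt2` are hypothetical helpers, not part of the authors' code. It only illustrates why the LL subband is cheaper to process: it holds a quarter of the pixels of the input, while the three high-frequency subbands carry the remaining detail.

```python
from typing import List, Tuple

Image = List[List[float]]


def haar_dwt2(img: Image) -> Tuple[Image, Image, Image, Image]:
    """Single-level 2D Haar transform (an assumed wavelet, for illustration).

    Returns (LL, LH, HL, HH); each subband has half the height and half the
    width of the input. Subband naming conventions vary between libraries.
    """
    h, w = len(img), len(img[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    ll, lh, hl, hh = ([[0.0] * (w // 2) for _ in range(h // 2)] for _ in range(4))
    for i in range(h // 2):
        for j in range(w // 2):
            # The 2x2 block of pixels that maps to coefficient (i, j).
            a, b = img[2 * i][2 * j], img[2 * i][2 * j + 1]
            c, d = img[2 * i + 1][2 * j], img[2 * i + 1][2 * j + 1]
            ll[i][j] = (a + b + c + d) / 2.0  # local average (low frequency)
            lh[i][j] = (a + b - c - d) / 2.0  # difference between the two rows
            hl[i][j] = (a - b + c - d) / 2.0  # difference between the two columns
            hh[i][j] = (a - b - c + d) / 2.0  # diagonal difference
    return ll, lh, hl, hh


def haar_idwt2(ll: Image, lh: Image, hl: Image, hh: Image) -> Image:
    """Inverse of haar_dwt2: rebuilds the full-resolution image exactly."""
    h2, w2 = len(ll), len(ll[0])
    out = [[0.0] * (2 * w2) for _ in range(2 * h2)]
    for i in range(h2):
        for j in range(w2):
            s, v, u, t = ll[i][j], lh[i][j], hl[i][j], hh[i][j]
            out[2 * i][2 * j] = (s + v + u + t) / 2.0
            out[2 * i][2 * j + 1] = (s + v - u - t) / 2.0
            out[2 * i + 1][2 * j] = (s - v + u - t) / 2.0
            out[2 * i + 1][2 * j + 1] = (s - v - u + t) / 2.0
    return out
```

In a pipeline of the kind the abstract describes, a corrupted render would be decomposed this way, the LL subband passed to the diffusion model, the other three subbands passed to the lightweight refinement network, and the repaired image recovered with the inverse transform. How the two networks are wired together is not stated in the abstract, so that part is left out of the sketch.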