Efficient multi-view training for 3D Gaussian Splatting
This work addresses a training bottleneck in 3DGS for inverse rendering applications, offering an incremental improvement over existing methods.
The paper tackles suboptimal optimization in 3D Gaussian Splatting (3DGS) caused by single-view mini-batch training, which increases the variance of mini-batch stochastic gradients. It proposes efficient multi-view training through a modified rasterization process, a 3D distance-aware D-SSIM loss, and multi-view adaptive density control, which together significantly enhance the performance of 3DGS and its variants.
3D Gaussian Splatting (3DGS) has emerged as a preferred choice alongside Neural Radiance Fields (NeRF) in inverse rendering due to its superior rendering speed. Currently, the common approach in 3DGS is to utilize "single-view" mini-batch training, where only one image is processed per iteration, in contrast to NeRF's "multi-view" mini-batch training, which leverages multiple images. We observe that such single-view training can lead to suboptimal optimization due to increased variance in mini-batch stochastic gradients, highlighting the necessity for multi-view training. However, implementing multi-view training in 3DGS poses challenges. Simply rendering multiple images per iteration incurs considerable overhead and may result in suboptimal Gaussian densification, because the densification procedure relies on single-view assumptions. To address these issues, we modify the rasterization process to minimize the overhead associated with multi-view training and propose a 3D distance-aware D-SSIM loss and multi-view adaptive density control that better suit multi-view scenarios. Our experiments demonstrate that the proposed methods significantly enhance the performance of 3DGS and its variants, freeing 3DGS from the constraints of single-view training.
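The variance argument in the abstract can be illustrated with a toy experiment. The sketch below is not the paper's method and does not involve any 3DGS code: it assumes each training view contributes one scalar "gradient" (the values, the scene size of 100 views, and the function name `minibatch_gradient_variance` are all hypothetical), and it measures how much the mini-batch gradient estimate fluctuates when one view versus several views are averaged per iteration.

```python
import random
import statistics


def minibatch_gradient_variance(per_view_grads, batch_size, trials=20000, seed=0):
    """Empirical variance of the mini-batch gradient estimate.

    per_view_grads: one toy scalar "gradient" per training view (assumption).
    batch_size: number of views averaged per iteration (1 = single-view).
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        # Draw distinct views for this iteration and average their gradients.
        batch = rng.sample(per_view_grads, batch_size)
        estimates.append(sum(batch) / batch_size)
    return statistics.pvariance(estimates)


# Hypothetical per-view gradients for a scene with 100 training views.
_rng = random.Random(42)
grads = [_rng.gauss(0.0, 1.0) for _ in range(100)]

var_single = minibatch_gradient_variance(grads, batch_size=1)
var_multi = minibatch_gradient_variance(grads, batch_size=8)

print(f"single-view variance: {var_single:.4f}")
print(f"8-view variance:      {var_multi:.4f}")
```

Under these assumptions, averaging B distinct views out of N shrinks the variance of the estimate by roughly a factor of B (more precisely, by (1/B)(N-B)/(N-1) relative to the single-view case), which is the statistical motivation for multi-view mini-batches. The toy model says nothing about rendering overhead or densification, which are the practical obstacles the abstract goes on to address.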