Generative Object Insertion in Gaussian Splatting with a Multi-View Diffusion Model
This work addresses the challenge of versatile scene recreation for 3D content creation, offering an incremental improvement over prior methods that struggled to produce high-quality insertions.
The paper tackles the problem of generating and inserting new objects into 3D content represented by Gaussian Splatting. It proposes a multi-view diffusion model (MVInpainter) with a ControlNet-based conditional injection module, followed by a mask-aware 3D reconstruction step, and reports more view-consistent, higher-quality insertions than existing methods.
Generating and inserting new objects into 3D content is a compelling approach for achieving versatile scene recreation. Existing methods, which rely on Score Distillation Sampling (SDS) optimization or single-view inpainting, often struggle to produce high-quality results. To address this, we propose a novel method for object insertion in 3D content represented by Gaussian Splatting. Our approach introduces a multi-view diffusion model, dubbed MVInpainter, which is built upon a pre-trained Stable Video Diffusion model to facilitate view-consistent object inpainting. Within MVInpainter, we incorporate a ControlNet-based conditional injection module to enable controlled and more predictable multi-view generation. After generating the multi-view inpainted results, we further propose a mask-aware 3D reconstruction technique to refine Gaussian Splatting reconstruction from these sparse inpainted views. By combining these techniques, our approach yields diverse results, ensures view-consistent and harmonious insertions, and produces better object quality. Extensive experiments demonstrate that our approach outperforms existing methods.
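The abstract does not specify how the reconstruction step uses the inpainting mask. As a rough illustration only, one plausible reading of "mask-aware" is a photometric loss that weights pixels inside the inpainted region more heavily than the surrounding scene, so that sparse inpainted views mainly supervise the new object. The function name `mask_weighted_l1`, the weights `w_inside` and `w_outside`, and the grayscale nested-list image format below are all hypothetical and are not taken from the paper.

```python
from typing import List

# Hypothetical image format: grayscale image as rows of pixel intensities.
Image = List[List[float]]


def mask_weighted_l1(
    rendered: Image,
    target: Image,
    mask: Image,
    w_inside: float = 1.0,   # assumed weight for pixels inside the inpainted mask
    w_outside: float = 0.1,  # assumed weight for pixels outside the mask
) -> float:
    """Weighted mean absolute error between a rendered view and its inpainted target.

    `mask` holds 1.0 where the object was inpainted and 0.0 elsewhere.
    This is an illustrative assumption, not the loss used by the authors.
    """
    total = 0.0
    weight_sum = 0.0
    for r_row, t_row, m_row in zip(rendered, target, mask):
        for r, t, m in zip(r_row, t_row, m_row):
            # Blend the two weights according to the mask value.
            w = w_inside * m + w_outside * (1.0 - m)
            total += w * abs(r - t)
            weight_sum += w
    # Guard against an all-zero weight map.
    return total / weight_sum if weight_sum > 0 else 0.0


if __name__ == "__main__":
    rendered = [[0.0, 1.0]]
    target = [[1.0, 1.0]]
    mask = [[1.0, 0.0]]
    print(mask_weighted_l1(rendered, target, mask))
```

In an actual Gaussian Splatting pipeline, a term of this kind would be evaluated on differentiable renders and combined with whatever other losses the reconstruction uses; setting `w_outside` to zero would ignore the background entirely in this sketch.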