CV, AI · Oct 24, 2024

3D-Adapter: Geometry-Consistent Multi-View Diffusion for High-Quality 3D Generation

arXiv:2410.18974v2 · 17 citations · h-index: 20
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of generating geometrically consistent 3D objects from text or images, a problem relevant to applications in computer graphics and AI. The contribution is incremental in that it builds on existing diffusion models.

The paper tackles geometric inconsistency in multi-view diffusion models for 3D generation by introducing 3D-Adapter, a plug-in module that infuses 3D geometry awareness into pretrained image diffusion models. This improves geometry quality and enables high-quality 3D generation even from plain text-to-image models.

Multi-view image diffusion models have significantly advanced open-domain 3D object generation. However, most existing models rely on 2D network architectures that lack inherent 3D biases, resulting in compromised geometric consistency. To address this challenge, we introduce 3D-Adapter, a plug-in module designed to infuse 3D geometry awareness into pretrained image diffusion models. Central to our approach is the idea of 3D feedback augmentation: for each denoising step in the sampling loop, 3D-Adapter decodes intermediate multi-view features into a coherent 3D representation, then re-encodes the rendered RGBD views to augment the pretrained base model through feature addition. We study two variants of 3D-Adapter: a fast feed-forward version based on Gaussian splatting and a versatile training-free version utilizing neural fields and meshes. Our extensive experiments demonstrate that 3D-Adapter not only greatly enhances the geometry quality of text-to-multi-view models such as Instant3D and Zero123++, but also enables high-quality 3D generation using the plain text-to-image Stable Diffusion. Furthermore, we showcase the broad application potential of 3D-Adapter by presenting high-quality results in text-to-3D, image-to-3D, text-to-texture, and text-to-avatar tasks.
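
The 3D feedback augmentation loop described in the abstract can be illustrated with a toy NumPy sketch. This is not the authors' implementation: every function name here (`make_base_denoiser`, `reconstruct_3d`, `render_views`, `encode_feedback`) is a hypothetical stand-in. The "3D representation" is assumed to be a simple cross-view average rather than Gaussian splatting or a neural field, and the feedback is added to the per-view estimate rather than to internal network features. The sketch only shows the control flow: estimate per view, fuse into one shared representation, render back to each view, and add the re-encoded result to the base output.

```python
import numpy as np

def make_base_denoiser(per_view_targets, rate=0.5):
    """Hypothetical stand-in for a pretrained 2D multi-view denoiser.

    Each view is pulled toward its own target; the targets disagree
    slightly, which models the lack of 3D consistency in a 2D network.
    """
    def base_denoise(x):
        return x + rate * (per_view_targets - x)
    return base_denoise

def reconstruct_3d(view_estimates):
    """Toy '3D representation': one consensus signal shared by all views."""
    return view_estimates.mean(axis=0)

def render_views(representation, n_views):
    """Toy renderer: every view observes the same shared representation."""
    return np.tile(representation, (n_views, 1))

def encode_feedback(rendered, view_estimates, strength=0.5):
    """Toy adapter branch: a residual that pulls each view toward its render."""
    return strength * (rendered - view_estimates)

def sample(n_views=4, size=8, steps=10, use_adapter=True, seed=0):
    rng = np.random.default_rng(seed)
    shared = rng.standard_normal(size)                   # view-consistent content
    bias = 0.3 * rng.standard_normal((n_views, size))    # per-view inconsistency
    base_denoise = make_base_denoiser(shared + bias)
    x = rng.standard_normal((n_views, size))             # start from noise
    for _ in range(steps):
        estimate = base_denoise(x)                       # per-view estimate
        if use_adapter:
            rep = reconstruct_3d(estimate)               # decode to a shared representation
            rendered = render_views(rep, n_views)        # render it back to every view
            estimate = estimate + encode_feedback(rendered, estimate)  # feature addition
        x = estimate
    return x

def view_disagreement(x):
    """Mean variance across views; lower means more consistent views."""
    return float(x.var(axis=0).mean())
```

Calling `view_disagreement(sample(use_adapter=True))` and `view_disagreement(sample(use_adapter=False))` with the same seed compares the two settings. In this toy the feedback leaves the cross-view mean unchanged and only shrinks the disagreement between views, which is the qualitative effect the abstract attributes to 3D feedback.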

Code Implementations: 1 repo
