CV | Aug 31, 2023

MVDream: Multi-view Diffusion for 3D Generation

arXiv:2308.16512v4 | 1001 citations | h-index: 58
Originality: Incremental advance
AI Analysis

This work addresses the challenge of generating coherent 3D models from text for applications in computer graphics and AI, representing an incremental improvement by building on existing diffusion and 3D generation techniques.

The paper tackles consistent 3D generation from text prompts by introducing MVDream, a multi-view diffusion model trained on both 2D and 3D data. The model combines the generalizability of 2D diffusion models with the consistency of 3D renderings and, used as a prior, significantly improves the consistency and stability of existing 2D-lifting methods for 3D generation.

We introduce MVDream, a diffusion model that is able to generate consistent multi-view images from a given text prompt. Learning from both 2D and 3D data, a multi-view diffusion model can achieve the generalizability of 2D diffusion models and the consistency of 3D renderings. We demonstrate that such a multi-view diffusion model is implicitly a generalizable 3D prior agnostic to 3D representations. It can be applied to 3D generation via Score Distillation Sampling, significantly enhancing the consistency and stability of existing 2D-lifting methods. It can also learn new concepts from a few 2D examples, akin to DreamBooth, but for 3D generation.
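The abstract mentions Score Distillation Sampling (SDS) without spelling it out. As a rough illustration only, the sketch below computes the standard SDS gradient, a weighted difference between predicted and injected noise, for a batch of rendered views. It is not MVDream's implementation: `sds_gradient`, `toy_predict_noise`, the four-view batch shape, and the noise level `alpha_bar_t` are all hypothetical names and values chosen for the example, and text and camera conditioning are omitted.

```python
import numpy as np

def sds_gradient(x, predict_noise, alpha_bar_t, rng, weight=1.0):
    """Return an SDS-style gradient with respect to rendered images x.

    x             : array of rendered views, e.g. shape (views, H, W, C)
    predict_noise : callable mapping noisy images to predicted noise
                    (assumed stand-in for a trained diffusion model)
    alpha_bar_t   : cumulative noise-schedule coefficient in (0, 1)
    """
    # Sample Gaussian noise and diffuse the renderings to noise level t.
    eps = rng.standard_normal(x.shape)
    x_t = np.sqrt(alpha_bar_t) * x + np.sqrt(1.0 - alpha_bar_t) * eps
    # SDS skips the denoiser's Jacobian and uses the noise residual directly.
    eps_hat = predict_noise(x_t)
    return weight * (eps_hat - eps)

def toy_predict_noise(x_t):
    # Hypothetical placeholder: a real model would be a trained network.
    return np.zeros_like(x_t)

rng = np.random.default_rng(0)
views = rng.standard_normal((4, 8, 8, 3))  # illustrative batch of 4 small views
grad = sds_gradient(views, toy_predict_noise, alpha_bar_t=0.5, rng=rng)
```

In a full pipeline this gradient would be backpropagated through a differentiable renderer into the parameters of the 3D representation. A denoiser that recovers the injected noise exactly yields a zero gradient, so the rendering is then left unchanged.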

Code Implementations: 4 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
