MicroDreamer: Efficient 3D Generation in $\sim$20 Seconds by Score-based Iterative Reconstruction
This addresses the slow speed of zero-shot 3D generation for applications like graphics and simulation, though it is incremental as it builds on existing diffusion-based methods.
The paper tackles the inefficiency of optimization-based 3D generation methods by introducing score-based iterative reconstruction (SIR), which reduces function evaluations and enables pixel-space optimization, resulting in MicroDreamer being 5-20 times faster than SDS while maintaining comparable performance and generating meshes in about 20 seconds.
Optimization-based approaches, such as score distillation sampling (SDS), show promise in zero-shot 3D generation but suffer from low efficiency, primarily due to the high number of function evaluations (NFEs) required for each sample and the limitation of optimization confined to latent space. This paper introduces score-based iterative reconstruction (SIR), an efficient and general algorithm mimicking a differentiable 3D reconstruction process to reduce the NFEs and enable optimization in pixel space. Given a single set of images sampled from a multi-view score-based diffusion model, SIR repeatedly optimizes 3D parameters, unlike the single-step optimization in SDS. With other improvements in training, we present an efficient approach called MicroDreamer that generally applies to various 3D representations and 3D generation tasks. In particular, MicroDreamer is 5-20 times faster than SDS in generating neural radiance field while retaining a comparable performance and takes about 20 seconds to create meshes from 3D Gaussian splatting on a single A100 GPU, halving the time of the fastest optimization-based baseline DreamGaussian with significantly superior performance compared to the measurement standard deviation. Our code is available at https://github.com/ML-GSAI/MicroDreamer.