CV | Dec 5, 2024

Turbo3D: Ultra-fast Text-to-3D Generation

Hanzhe Hu, Tianwei Yin, Fujun Luan, Yiwei Hu, Hao Tan, Zexiang Xu, Sai Bi, Shubham Tulsiani, Kai Zhang

arXiv:2412.04470v1 | 15.313 citations | h-index: 45 | CVPR

Originality: Incremental advance

AI Analysis

This work addresses the need for faster 3D content creation in applications such as gaming and VR, and represents a strong incremental improvement in efficiency.

The paper tackles the problem of slow text-to-3D generation by introducing Turbo3D, which generates high-quality Gaussian splatting assets in under one second, outperforming previous baselines in speed and quality.

We present Turbo3D, an ultra-fast text-to-3D system capable of generating high-quality Gaussian splatting assets in under one second. Turbo3D employs a rapid 4-step, 4-view diffusion generator and an efficient feed-forward Gaussian reconstructor, both operating in latent space. The 4-step, 4-view generator is a student model distilled through a novel Dual-Teacher approach, which encourages the student to learn view consistency from a multi-view teacher and photo-realism from a single-view teacher. By shifting the Gaussian reconstructor's inputs from pixel space to latent space, we eliminate the extra image decoding time and halve the transformer sequence length for maximum efficiency. Our method demonstrates superior 3D generation results compared to previous baselines, while operating in a fraction of their runtime.
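The data flow described in the abstract can be illustrated with a toy sketch: four denoising steps run jointly over four view latents, and a feed-forward reconstructor then maps those latents directly to Gaussian parameters with no image decoding in between. Only the step count and view count come from the abstract. The function names, the latent sizes, the 14-value Gaussian layout, and the one-Gaussian-per-latent-token mapping are all assumptions made for illustration, and the toy functions are stand-ins, not the authors' networks.

```python
import random

NUM_VIEWS = 4          # from the abstract: 4-view generator
NUM_STEPS = 4          # from the abstract: 4-step sampling
TOKENS_PER_VIEW = 16   # assumed latent tokens per view (not given in the abstract)
LATENT_CHANNELS = 4    # assumed latent channel count (not given in the abstract)
GAUSSIAN_DIM = 14      # assumed layout: 3 position + 3 scale + 4 rotation + 1 opacity + 3 color


def toy_denoiser(latents, text_scalar):
    """Hypothetical stand-in for the distilled student: one denoising update on all views."""
    return [[[0.5 * c + 0.01 * text_scalar for c in token] for token in view]
            for view in latents]


def generate_view_latents(text_scalar, rng):
    """Start from Gaussian noise and apply NUM_STEPS updates jointly over NUM_VIEWS latents."""
    latents = [[[rng.gauss(0.0, 1.0) for _ in range(LATENT_CHANNELS)]
                for _ in range(TOKENS_PER_VIEW)]
               for _ in range(NUM_VIEWS)]
    for _ in range(NUM_STEPS):
        latents = toy_denoiser(latents, text_scalar)
    return latents


def toy_reconstructor(latents, weights):
    """Hypothetical feed-forward reconstructor: a linear map from each latent token
    to one Gaussian's parameters, consuming latents directly (no image decoding)."""
    tokens = [token for view in latents for token in view]
    return [[sum(x * w for x, w in zip(token, column)) for column in weights]
            for token in tokens]


def text_to_gaussians(text_scalar, seed=0):
    """End-to-end toy pipeline: text conditioning -> view latents -> Gaussian parameters."""
    rng = random.Random(seed)
    latents = generate_view_latents(text_scalar, rng)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(LATENT_CHANNELS)]
               for _ in range(GAUSSIAN_DIM)]
    return toy_reconstructor(latents, weights)


gaussians = text_to_gaussians(text_scalar=1.0)
print(len(gaussians), len(gaussians[0]))  # 64 14
```

In this sketch the reconstructor's input length is set by the latent token count alone, so skipping the decode to pixels removes a whole stage from the inference path. The actual saving reported in the abstract depends on the authors' architecture, which this page does not describe.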
