CV · Dec 5, 2024

Turbo3D: Ultra-fast Text-to-3D Generation

arXiv:2412.04470v1 · 13 citations · h-index: 45 · CVPR
Originality: Incremental advance
AI Analysis

Turbo3D addresses the need for faster 3D content creation in applications such as gaming and VR, and represents a strong incremental improvement in efficiency.

The paper tackles the problem of slow text-to-3D generation by introducing Turbo3D, which generates high-quality Gaussian splatting assets in under one second, outperforming previous baselines in speed and quality.

We present Turbo3D, an ultra-fast text-to-3D system capable of generating high-quality Gaussian splatting assets in under one second. Turbo3D employs a rapid 4-step, 4-view diffusion generator and an efficient feed-forward Gaussian reconstructor, both operating in latent space. The 4-step, 4-view generator is a student model distilled through a novel Dual-Teacher approach, which encourages the student to learn view consistency from a multi-view teacher and photo-realism from a single-view teacher. By shifting the Gaussian reconstructor's inputs from pixel space to latent space, we eliminate the extra image decoding time and halve the transformer sequence length for maximum efficiency. Our method demonstrates superior 3D generation results compared to previous baselines, while operating in a fraction of their runtime.
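The Dual-Teacher idea in the abstract can be illustrated with a toy sketch: the student's four views are pulled toward a multi-view teacher (for cross-view consistency) and, view by view, toward a single-view teacher (for per-image quality). Everything here is an assumption made for illustration: the names `dual_teacher_loss`, `toy_multi_view_teacher`, and `toy_single_view_teacher`, the weights `w_mv` and `w_sv`, and the use of plain mean squared error, which stands in for whatever distillation objective the paper actually uses (the abstract does not specify it).

```python
from typing import Callable, List

View = List[float]  # toy stand-in for one view's latent


def mse(a: View, b: View) -> float:
    """Mean squared error between two equally sized vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def dual_teacher_loss(
    student_views: List[View],
    multi_view_teacher: Callable[[List[View]], List[View]],
    single_view_teacher: Callable[[View], View],
    w_mv: float = 1.0,  # hypothetical weight on the multi-view term
    w_sv: float = 1.0,  # hypothetical weight on the single-view term
) -> float:
    """Weighted sum of a multi-view term and a single-view term (toy MSE)."""
    n = len(student_views)
    # The multi-view teacher sees all views jointly and returns one target per view.
    mv_targets = multi_view_teacher(student_views)
    mv_loss = sum(mse(v, t) for v, t in zip(student_views, mv_targets)) / n
    # The single-view teacher sees each view on its own.
    sv_loss = sum(mse(v, single_view_teacher(v)) for v in student_views) / n
    return w_mv * mv_loss + w_sv * sv_loss


def toy_multi_view_teacher(views: List[View]) -> List[View]:
    """Toy teacher: targets every view at the cross-view mean (rewards agreement)."""
    n = len(views)
    mean = [sum(column) / n for column in zip(*views)]
    return [mean[:] for _ in views]


def toy_single_view_teacher(view: View) -> View:
    """Toy teacher: clamps each value into [0, 1] (a stand-in for 'looks realistic')."""
    return [min(1.0, max(0.0, x)) for x in view]


consistent = [[0.25, 0.75] for _ in range(4)]
inconsistent = [[0.25, 0.75], [0.75, 0.25], [0.25, 0.75], [0.25, 0.75]]

loss_consistent = dual_teacher_loss(
    consistent, toy_multi_view_teacher, toy_single_view_teacher
)
loss_inconsistent = dual_teacher_loss(
    inconsistent, toy_multi_view_teacher, toy_single_view_teacher
)
```

In this toy setup, four identical in-range views incur zero loss, while a set with one disagreeing view is penalized by the multi-view term only. That separation is the point of the sketch: each teacher contributes a distinct signal.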
