CV · Dec 5, 2023

TPA3D: Triplane Attention for Fast Text-to-3D Generation

arXiv:2312.02647v4 · 7 citations · h-index: 4 · ECCV
Originality: Incremental advance
AI Analysis

This work addresses the need for fast text-to-3D generation in applications such as gaming and design, though the advance is incremental because it builds on existing GAN and attention methods.

The paper tackles the problem of slow text-to-3D generation by proposing TPA3D, a GAN-based model that uses triplane attention to generate 3D meshes from text, achieving high-quality results with improved computational efficiency.

Due to the lack of large-scale text-3D correspondence data, recent text-to-3D generation works mainly rely on utilizing 2D diffusion models for synthesizing 3D data. Since diffusion-based methods typically require significant optimization time for both training and inference, the use of GAN-based models would still be desirable for fast 3D generation. In this work, we propose Triplane Attention for text-guided 3D generation (TPA3D), an end-to-end trainable GAN-based deep learning model for fast text-to-3D generation. With only 3D shape data and their rendered 2D images observed during training, our TPA3D is designed to retrieve detailed visual descriptions for synthesizing the corresponding 3D mesh data. This is achieved by the proposed attention mechanisms on the extracted sentence and word-level text features. In our experiments, we show that TPA3D generates high-quality 3D textured shapes aligned with fine-grained descriptions, while impressive computation efficiency can be observed.
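The abstract does not spell out the attention layers, so the following is only a rough illustration of how word-level text features could condition triplane features. It is a minimal sketch of generic scaled dot-product cross-attention, with flattened triplane feature tokens as queries and word features as keys and values. The function names, the shapes, the residual connection, and the absence of learned projections are all assumptions made for illustration; this is not the TPA3D implementation, and it leaves out the sentence-level branch the abstract mentions.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def word_cross_attention(plane_tokens, word_feats):
    """Refine each triplane token with a softmax-weighted mix of word features.

    plane_tokens: list of N feature vectors of length d (queries)
    word_feats:   list of T word-level feature vectors of length d (keys, values)
    Returns (refined_tokens, attention_weights).
    Assumption: no learned projections; features are used as-is.
    """
    d = len(word_feats[0])
    scale = 1.0 / math.sqrt(d)
    refined, weights = [], []
    for q in plane_tokens:
        # One attention distribution over the words for this token.
        w = softmax([dot(q, k) * scale for k in word_feats])
        # Weighted sum of word features (values) for this token.
        context = [
            sum(w[t] * word_feats[t][j] for t in range(len(word_feats)))
            for j in range(d)
        ]
        # Residual connection (an assumption, common in attention blocks).
        refined.append([qi + ci for qi, ci in zip(q, context)])
        weights.append(w)
    return refined, weights


# Toy usage with random features: 6 plane tokens, 4 words, feature size 8.
rng = random.Random(0)
d, n_tokens, n_words = 8, 6, 4
plane_tokens = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n_tokens)]
word_feats = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n_words)]
refined, weights = word_cross_attention(plane_tokens, word_feats)
```

Each row of `weights` is a distribution over the words for one triplane token, which is the sense in which such a layer lets individual spatial features pick up fine-grained words from the description. Plain Python lists keep the sketch dependency-free; a practical version would use batched tensor operations.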

