CV · AI · Oct 20, 2022

TANGO: Text-driven Photorealistic and Robust 3D Stylization via Lighting Decomposition

Stanford
arXiv:2210.11277v2 · 108 citations · h-index: 35
Originality: Highly original
AI Analysis

This paper addresses the challenge of creating realistic 3D content from text for computer vision and graphics applications. It presents a novel method rather than an incremental improvement.

The paper tackles text-driven stylization of photorealistic 3D appearance. It proposes TANGO, which disentangles appearance into reflectance, local geometric variation, and lighting, and reports that it outperforms existing methods in photorealistic quality, 3D geometric consistency, and robustness on low-quality meshes.

Creation of 3D content by stylization is a promising yet challenging problem in computer vision and graphics research. In this work, we focus on stylizing photorealistic appearance renderings of a given surface mesh of arbitrary topology. Motivated by the recent surge of cross-modal supervision of the Contrastive Language-Image Pre-training (CLIP) model, we propose TANGO, which transfers the appearance style of a given 3D shape according to a text prompt in a photorealistic manner. Technically, we propose to disentangle the appearance style as the spatially varying bidirectional reflectance distribution function, the local geometric variation, and the lighting condition, which are jointly optimized, via supervision of the CLIP loss, by a spherical Gaussians based differentiable renderer. As such, TANGO enables photorealistic 3D style transfer by automatically predicting reflectance effects even for bare, low-quality meshes, without training on a task-specific dataset. Extensive experiments show that TANGO outperforms existing methods of text-driven 3D style transfer in terms of photorealistic quality, consistency of 3D geometry, and robustness when stylizing low-quality meshes. Our codes and results are available at our project webpage https://cyw-3d.github.io/tango/.
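To make the two ingredients named in the abstract more concrete, here is a minimal NumPy sketch of (a) a spherical Gaussian lobe, the standard building block of spherical Gaussians based lighting, and (b) a cosine-distance loss between an image embedding and a text embedding, in the spirit of CLIP supervision. This is an illustration under stated assumptions, not TANGO's implementation: the function names, the lobe parameters, and the exact loss form are hypothetical, and the embeddings are plain vectors standing in for real CLIP outputs.

```python
import numpy as np

def normalize(v):
    """Scale a vector to unit length."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def spherical_gaussian(v, axis, sharpness, amplitude):
    """Evaluate one lobe: amplitude * exp(sharpness * (dot(v, axis) - 1))."""
    cos_angle = np.dot(normalize(v), normalize(axis))
    return np.asarray(amplitude, dtype=float) * np.exp(sharpness * (cos_angle - 1.0))

def sg_environment(v, lobes):
    """Sum the RGB contributions of all lobes for a single direction v."""
    return sum(spherical_gaussian(v, ax, lam, mu) for ax, lam, mu in lobes)

def clip_style_loss(image_embedding, text_embedding):
    """1 - cosine similarity; the inputs are placeholders for CLIP embeddings."""
    return 1.0 - float(np.dot(normalize(image_embedding), normalize(text_embedding)))

# Hypothetical lighting: (lobe axis, sharpness, RGB amplitude) per lobe.
lobes = [
    ([0.0, 1.0, 0.0], 8.0, [1.0, 0.9, 0.8]),  # narrow warm light from above
    ([1.0, 0.0, 0.0], 2.0, [0.1, 0.1, 0.3]),  # broad bluish fill from the side
]

# Incoming radiance when looking straight up at the main light.
radiance_up = sg_environment([0.0, 1.0, 0.0], lobes)
```

In a full pipeline the lobe parameters, the reflectance, and the normals would be optimized by gradient descent through a differentiable renderer, with the loss computed on rendered views; that requires an autodiff framework and a real CLIP model, both of which are outside this sketch.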

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
