Text2VDM: Text to Vector Displacement Maps for Expressive and Interactive 3D Sculpting
The paper addresses a specific challenge in artist-facing 3D asset creation by enabling text-driven brush generation, though the contribution appears incremental because it builds on existing methods such as score distillation sampling (SDS).
The paper tackles the problem of generating reusable sculpting brushes, represented as vector displacement maps (VDMs), from text, a task that existing models struggle with. It introduces Text2VDM, a framework that uses score distillation sampling with weighted blending of prompt tokens to produce diverse, high-quality brushes compatible with professional 3D software.
Professional 3D asset creation often requires diverse sculpting brushes to add surface details and geometric structures. Despite recent progress in 3D generation, producing reusable sculpting brushes compatible with artists' workflows remains an open and challenging problem. These sculpting brushes are typically represented as vector displacement maps (VDMs), which existing models cannot generate as easily as natural images. This paper presents Text2VDM, a novel framework for text-to-VDM brush generation through the deformation of a dense planar mesh guided by score distillation sampling (SDS). The original SDS loss is designed for generating full objects and struggles to generate desirable sub-object structures from scratch, which is what brush generation requires. We refer to this issue as semantic coupling, which we address by introducing weighted blending of prompt tokens into SDS, resulting in a more accurate target distribution and semantic guidance. Experiments demonstrate that Text2VDM can generate diverse, high-quality VDM brushes for sculpting surface details and geometric structures. Our generated brushes can be seamlessly integrated into mainstream modeling software, enabling various applications such as mesh stylization and real-time interactive modeling.
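To make the VDM representation concrete, the following is a minimal sketch of how a deformed dense planar mesh can be stored as a vector displacement map and then re-applied as a brush stamp. It is an illustration under stated assumptions, not the paper's implementation: the helper names (`make_planar_grid`, `bake_vdm`, `apply_vdm`), the grid resolution, and the hand-written Gaussian bump that stands in for an SDS-optimized deformation are all hypothetical.

```python
import numpy as np

def make_planar_grid(res):
    """Return (res*res, 3) vertices of a flat grid on the z=0 plane over [-1, 1]^2."""
    u = np.linspace(-1.0, 1.0, res)
    xx, yy = np.meshgrid(u, u, indexing="xy")
    zz = np.zeros_like(xx)
    return np.stack([xx, yy, zz], axis=-1).reshape(-1, 3)

def bake_vdm(rest_vertices, deformed_vertices, res):
    """Store per-vertex 3D offsets as a (res, res, 3) vector displacement map."""
    return (deformed_vertices - rest_vertices).reshape(res, res, 3)

def apply_vdm(rest_vertices, vdm, strength=1.0):
    """Displace the rest mesh by the VDM, scaled by a brush strength."""
    return rest_vertices + strength * vdm.reshape(-1, 3)

res = 64
rest = make_planar_grid(res)

# Hypothetical stand-in for an optimized deformation: a bump that rises in z
# and also leans sideways in x. The lateral component is what a scalar
# height map could not represent, and what a vector map can.
r2 = rest[:, 0] ** 2 + rest[:, 1] ** 2
height = np.exp(-r2 / 0.1)
offset = np.stack([0.3 * height, np.zeros_like(height), height], axis=-1)
deformed = rest + offset

vdm = bake_vdm(rest, deformed, res)            # three channels: (dx, dy, dz)
stamped = apply_vdm(rest, vdm, strength=0.5)   # re-apply at half strength
```

Because each texel stores a full 3D offset rather than a single height, the same array can encode overhangs and sideways features. In this sketch the brush is applied to the flat plane it was baked from; applying it to a curved surface would additionally require expressing the offsets in the surface's local tangent frame, which is omitted here.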