3D Highlighter: Localizing Regions on 3D Shapes via Text Descriptions
This work addresses the challenge of text-guided 3D shape editing for users in graphics and AI, offering a flexible and generalizable solution.
The paper tackles the problem of localizing semantic regions on 3D meshes from text descriptions. It enables out-of-domain applications, such as adding clothing to animal models, without requiring 3D datasets or annotations.
We present 3D Highlighter, a technique for localizing semantic regions on a mesh using text as input. A key feature of our system is the ability to interpret "out-of-domain" localizations. Our system demonstrates the ability to reason about where to place non-obviously related concepts on an input 3D shape, such as adding clothing to a bare 3D animal model. Our method contextualizes the text description using a neural field and colors the corresponding region of the shape using a probability-weighted blend. Our neural optimization is guided by a pre-trained CLIP encoder, which bypasses the need for any 3D datasets or 3D annotations. Thus, 3D Highlighter is highly flexible, general, and capable of producing localizations on a myriad of input shapes. Our code is publicly available at https://github.com/threedle/3DHighlighter.
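The probability-weighted blend mentioned above can be illustrated with a minimal sketch. It assumes each vertex already has a highlight probability in [0, 1] (in the paper this would come from the neural field; here it is passed in directly) and mixes a highlight color with a base color in proportion to that probability. The function name `blend_vertex_colors` and the default colors are hypothetical choices for illustration and are not taken from the released code.

```python
from typing import List, Sequence, Tuple

Color = Tuple[float, float, float]


def blend_vertex_colors(
    probs: Sequence[float],
    highlight: Color = (1.0, 1.0, 0.0),  # assumed highlight color (yellow)
    base: Color = (0.5, 0.5, 0.5),       # assumed base color (gray)
) -> List[Color]:
    """Color each vertex as p * highlight + (1 - p) * base.

    `probs` holds one highlight probability per vertex. In this sketch the
    probabilities are given; producing them with a neural field is out of
    scope here.
    """
    colors: List[Color] = []
    for p in probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
        # Linear interpolation per RGB channel, weighted by the probability.
        r, g, b = (p * h + (1.0 - p) * c for h, c in zip(highlight, base))
        colors.append((r, g, b))
    return colors


if __name__ == "__main__":
    # Three vertices: outside the region, inside it, and uncertain.
    for color in blend_vertex_colors([0.0, 1.0, 0.5]):
        print(color)
```

Because the blend is linear in the probability, the resulting vertex colors vary smoothly with the field's output, which is what allows an image-based signal such as CLIP to guide the optimization through the rendered colors.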