Dream-in-Style: Text-to-3D Generation Using Stylized Score Distillation
This work addresses stylized 3D content creation for artists and designers, but the contribution is incremental because it builds on existing text-to-3D methods.
The paper generates 3D objects in a specified style from a text prompt and a style reference image, and reports strong visual performance supported by a user study.
We present a method to generate stylized 3D objects. Our method takes a text prompt and a style reference image as input and reconstructs a neural radiance field to synthesize a 3D model whose content aligns with the text prompt and whose style follows the reference image. To generate the 3D object and perform style transfer simultaneously, we propose a stylized score distillation loss that guides a text-to-3D optimization process toward visually plausible geometry and appearance. Our stylized score distillation combines an original pretrained text-to-image model with a modified sibling in which the key and value features of the self-attention layers are manipulated to inject styles from the reference image. Comparisons with state-of-the-art methods demonstrate the strong visual performance of our method, which is further supported by the quantitative results of our user study.
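The two ingredients named in the abstract, key/value manipulation in self-attention and a combination of two score predictions, can be illustrated with a toy NumPy sketch. This is not the paper's implementation. The function names, the single-head attention without learned projections, and the linear blend controlled by `style_weight` are all assumptions made for illustration, since the abstract does not specify how the two models are combined. The gradient form `w_t * (eps_pred - eps_true)` follows the standard score distillation formulation.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(q, k, v):
    """Scaled dot-product attention over (tokens, dim) arrays."""
    scale = 1.0 / np.sqrt(q.shape[-1])
    return softmax((q @ k.T) * scale, axis=-1) @ v


def style_injected_attention(q_content, k_style, v_style):
    """Hypothetical helper: queries come from the rendered view, while
    keys and values are taken from the style reference image."""
    return self_attention(q_content, k_style, v_style)


def stylized_score(eps_original, eps_styled, style_weight):
    """Combine the two noise predictions.

    ASSUMPTION: a linear blend. The abstract only says the loss is
    'based on a combination' of the original and the modified model.
    """
    return (1.0 - style_weight) * eps_original + style_weight * eps_styled


def sds_gradient(eps_pred, eps_true, w_t=1.0):
    """Standard score distillation gradient with respect to the rendered
    image, with the denoiser Jacobian omitted."""
    return w_t * (eps_pred - eps_true)


# Toy usage with random features standing in for real network activations.
rng = np.random.default_rng(0)
q_view = rng.normal(size=(4, 8))    # tokens of the rendered view
k_ref = rng.normal(size=(6, 8))     # tokens of the style reference
v_ref = rng.normal(size=(6, 8))
styled_features = style_injected_attention(q_view, k_ref, v_ref)

eps_true = rng.normal(size=(4, 8))  # noise that was added to the render
eps_orig = rng.normal(size=(4, 8))  # stand-in: original model prediction
eps_style = rng.normal(size=(4, 8)) # stand-in: style-injected prediction
grad = sds_gradient(stylized_score(eps_orig, eps_style, 0.5), eps_true)
```

In this sketch, setting `style_weight` to 0 recovers ordinary score distillation with the original model, and setting it to 1 relies entirely on the style-injected model. How the paper actually weights or schedules the two predictions is not stated in the abstract.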