CV · Apr 10, 2024

Tuning-Free Adaptive Style Incorporation for Structure-Consistent Text-Driven Style Transfer

Yanqi Ge, Jiaqi Liu, Qingnan Fan, Xi Jiang, Ye Huang, Shuai Qin, Hong Gu, Wen Li, Lixin Duan

arXiv:2404.06835v1 · 3.71 citations · h-index: 52

Originality: Incremental advance

AI Analysis

This work addresses consistent structure preservation in text-driven style transfer for users of text-to-image models, representing an incremental improvement over previous methods.

The paper tackles text-driven style transfer in text-to-image diffusion models by addressing structure distortion issues, proposing Adaptive Style Incorporation (ASI) with Siamese Cross-Attention and Adaptive Content-Style Blending, which experimentally shows improved performance in structure preservation and stylized effects.

In this work, we target the task of text-driven style transfer in the context of text-to-image (T2I) diffusion models. The main challenge is consistent structure preservation while enabling effective style transfer effects. Past approaches in this field directly concatenate the content and style prompts for prompt-level style injection, leading to unavoidable structure distortions. We propose a novel solution to the text-driven style transfer task, namely Adaptive Style Incorporation (ASI), to achieve fine-grained feature-level style incorporation. It consists of Siamese Cross-Attention (SiCA), which decouples the single-track cross-attention into a dual-track structure to obtain separate content and style features, and the Adaptive Content-Style Blending (AdaBlending) module, which couples the content and style information in a structure-consistent manner. Experimentally, our method exhibits much better performance in both structure preservation and stylized effects.
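
The dual-track idea in the abstract can be illustrated with a toy sketch: the same image queries attend separately to the content prompt tokens and to the style prompt tokens, and the two resulting feature maps are then blended. This is not the authors' implementation. The function names, the use of raw token embeddings as both keys and values (no learned projections), and the fixed scalar `style_weight` are all assumptions made for illustration; the abstract does not specify how AdaBlending computes its adaptive blend, so a plain convex combination stands in for it here.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention: one output row per query."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(len(values[0]))])
    return out


def dual_track_attention(queries, content_tokens, style_tokens, style_weight=0.5):
    """Hypothetical dual-track cross-attention with a fixed blend.

    The same queries attend to content and style tokens separately
    (two tracks), instead of to one concatenated prompt. The blend is a
    simple convex combination, an assumed stand-in for adaptive blending.
    """
    content_feat = cross_attention(queries, content_tokens, content_tokens)
    style_feat = cross_attention(queries, style_tokens, style_tokens)
    blended = [[(1.0 - style_weight) * c + style_weight * s
                for c, s in zip(c_row, s_row)]
               for c_row, s_row in zip(content_feat, style_feat)]
    return content_feat, style_feat, blended


# Toy example: 2 image queries, 3 content tokens, 2 style tokens, dimension 2.
queries = [[1.0, 0.0], [0.0, 1.0]]
content_tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
style_tokens = [[0.5, -0.5], [2.0, 1.0]]
content_feat, style_feat, blended = dual_track_attention(
    queries, content_tokens, style_tokens, style_weight=0.3
)
```

With `style_weight=0.0` the output equals the content track alone, and with `style_weight=1.0` it equals the style track alone. Keeping the tracks separate until the blend is what distinguishes this from prompt-level concatenation, where content and style tokens compete inside a single softmax.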
