SPG: Style-Prompting Guidance for Style-Specific Content Creation
This work addresses style-specific content creation for users of text-to-image models. It is an incremental improvement that integrates with existing frameworks such as ControlNet.
The paper tackles the challenge of controlling visual style in text-to-image diffusion models by proposing Style-Prompting Guidance (SPG), a novel sampling strategy that uses a style noise vector to guide generation toward a target style. Combined with Classifier-Free Guidance, it achieves both semantic fidelity and style consistency.
Although recent text-to-image (T2I) diffusion models excel at aligning generated images with textual prompts, controlling the visual style of the output remains a challenging task. In this work, we propose Style-Prompting Guidance (SPG), a novel sampling strategy for style-specific image generation. SPG constructs a style noise vector and leverages its directional deviation from unconditional noise to guide the diffusion process toward the target style distribution. By integrating SPG with Classifier-Free Guidance (CFG), our method achieves both semantic fidelity and style consistency. SPG is simple, robust, and compatible with controllable frameworks like ControlNet and IPAdapter, making it practical and widely applicable. Extensive experiments demonstrate the effectiveness and generality of our approach compared to state-of-the-art methods. Code is available at https://github.com/Rumbling281441/SPG.
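The abstract describes the guidance only in words, so the following is a minimal sketch of how a style direction might be combined with Classifier-Free Guidance at a single sampling step. The additive form, the function name `combine_guidance`, and the weights `w_text` and `w_style` are assumptions made for illustration; they are not taken from the paper or its repository. Short lists of floats stand in for the latent-shaped noise predictions a real denoiser would return.

```python
from typing import List

Vector = List[float]


def combine_guidance(
    eps_uncond: Vector,
    eps_text: Vector,
    eps_style: Vector,
    w_text: float = 7.5,
    w_style: float = 3.0,
) -> Vector:
    """Hypothetical per-step noise combination (assumed form, not the paper's).

    eps_uncond: noise predicted with an empty prompt
    eps_text:   noise predicted with the content prompt (the CFG branch)
    eps_style:  noise predicted with the style prompt (the "style noise vector")
    """
    return [
        # Start from the unconditional prediction, then move along the
        # text direction (standard CFG) and along the style direction,
        # i.e. the deviation of the style noise from the unconditional noise.
        u + w_text * (t - u) + w_style * (s - u)
        for u, t, s in zip(eps_uncond, eps_text, eps_style)
    ]


# Toy three-element "noise predictions" for illustration only.
eps_uncond = [0.0, 1.0, -1.0]
eps_text = [1.0, 1.0, 0.0]
eps_style = [0.0, 2.0, -1.0]

guided = combine_guidance(eps_uncond, eps_text, eps_style, w_text=2.0, w_style=1.0)
print(guided)
```

Under this assumed form, setting `w_style` to zero recovers ordinary Classifier-Free Guidance, which is one way to read the claim that the two kinds of guidance can be used together.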