CV, GR, LG | Dec 4, 2023

Style Aligned Image Generation via Shared Attention

arXiv:2312.02133v2 | 254 citations | h-index: 15 | CVPR
Originality: Incremental advance
AI Analysis

This paper addresses style control in text-to-image (T2I) models for creative applications, though the advance appears incremental because it builds on existing diffusion methods.

The paper tackles the problem of ensuring consistent style in text-to-image generation by introducing StyleAligned, a technique that uses minimal attention sharing during diffusion to maintain style alignment across images, achieving high-quality synthesis and fidelity.

Large-scale Text-to-Image (T2I) models have rapidly gained prominence across creative fields, generating visually compelling outputs from textual prompts. However, controlling these models to ensure consistent style remains challenging, with existing methods necessitating fine-tuning and manual intervention to disentangle content and style. In this paper, we introduce StyleAligned, a novel technique designed to establish style alignment among a series of generated images. By employing minimal 'attention sharing' during the diffusion process, our method maintains style consistency across images within T2I models. This approach allows for the creation of style-consistent images using a reference style through a straightforward inversion operation. Our method's evaluation across diverse styles and text prompts demonstrates high-quality synthesis and fidelity, underscoring its efficacy in achieving consistent style across various inputs.
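
The idea of attention sharing can be illustrated with a toy sketch. This is not the authors' implementation: it assumes that "sharing" means each generated image's queries attend to its own keys and values plus those of one reference image, and it leaves out any further steps the full method may apply. The function names (`attention`, `shared_attention`) and the `ref` parameter are hypothetical, and tokens are plain Python lists rather than tensors from a diffusion model.

```python
import math

def softmax(row):
    # Numerically stable softmax over a list of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    # Scaled dot-product attention for one image.
    # q: [n_q][d] queries, k: [n_k][d] keys, v: [n_k][d_v] values.
    d = len(q[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        w = softmax(scores)
        out.append([sum(wj * vj[c] for wj, vj in zip(w, v))
                    for c in range(len(v[0]))])
    return out

def shared_attention(qs, ks, vs, ref=0):
    # qs, ks, vs: one entry per image in the batch.
    # Assumption: image `ref` is the style reference and attends only to itself;
    # every other image attends to the reference's tokens plus its own.
    outs = []
    for i in range(len(qs)):
        if i == ref:
            k, v = ks[i], vs[i]
        else:
            k = ks[ref] + ks[i]  # list concatenation along the token axis
            v = vs[ref] + vs[i]
        outs.append(attention(qs[i], k, v))
    return outs
```

Under this assumption the reference image's output is unchanged by sharing, while every other image's output becomes a weighted mix of its own values and the reference's values, which is one way features of a common style could propagate across a batch.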

Code Implementations: 2 repos