CV | Nov 9, 2025

V-Shuffle: Zero-Shot Style Transfer via Value Shuffle

arXiv:2511.06365v1 | h-index: 12
Originality: Incremental advance
AI Analysis

This paper addresses content leakage in style transfer for image generation. It is an incremental improvement over existing attention-based methods.

The paper tackles content leakage in attention-based style transfer by proposing V-Shuffle, a zero-shot method that shuffles value features in diffusion model attention layers to preserve low-level style while disrupting semantic content. Results show it outperforms previous state-of-the-art methods with single style images and achieves excellent performance with multiple style images.

Attention injection-based style transfer has achieved remarkable progress in recent years. However, existing methods often suffer from content leakage, where the undesired semantic content of the style image mistakenly appears in the stylized output. In this paper, we propose V-Shuffle, a zero-shot style transfer method that leverages multiple style images from the same style domain to effectively navigate the trade-off between content preservation and style fidelity. V-Shuffle implicitly disrupts the semantic content of the style images by shuffling the value features within the self-attention layers of the diffusion model, thereby preserving low-level style representations. We further introduce a Hybrid Style Regularization that complements these low-level representations with high-level style textures to enhance style fidelity. Empirical results demonstrate that V-Shuffle achieves excellent performance when utilizing multiple style images. Moreover, when applied to a single style image, V-Shuffle outperforms previous state-of-the-art methods.
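As a rough illustration of the value-shuffle idea in the abstract, the sketch below runs scaled dot-product attention from content queries to keys and values pooled over several style images, with the value tokens randomly permuted before they are aggregated. The abstract does not say how the shuffle is performed, so the random permutation over token positions, the pooling across images, and all function names (`attention`, `shuffle_values`, `style_attention`) are assumptions made for illustration, not the authors' implementation. Hybrid Style Regularization is not covered.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of token vectors."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    dim = len(values[0])
    out = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(dim)])
    return out


def shuffle_values(values, rng):
    """Return the value tokens in random order (assumed form of the shuffle).

    The set of value vectors is unchanged, so per-channel statistics are kept,
    but each value no longer sits at the position of its original key.
    """
    shuffled = list(values)  # copy so the caller's list is not mutated
    rng.shuffle(shuffled)
    return shuffled


def style_attention(content_q, style_kv_per_image, rng, shuffle=True):
    """Attend from content queries to keys/values pooled over style images."""
    keys = [k for ks, _ in style_kv_per_image for k in ks]
    values = [v for _, vs in style_kv_per_image for v in vs]
    if shuffle:
        values = shuffle_values(values, rng)
    return attention(content_q, keys, values)


# Toy data: random vectors stand in for diffusion-model features.
rng = random.Random(0)


def rand_tokens(n, dim):
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]


content_q = rand_tokens(4, 8)                      # 4 content tokens
style_images = [(rand_tokens(6, 8), rand_tokens(6, 8)) for _ in range(2)]

plain = style_attention(content_q, style_images, rng, shuffle=False)
mixed = style_attention(content_q, style_images, rng, shuffle=True)
print(len(mixed), len(mixed[0]))
```

Under these assumptions, permuting the values leaves the pool of style features intact while breaking the key-to-value correspondence that would otherwise let the spatial layout of a style image carry over into the output.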

