CV · Jan 25

StyleDecoupler: Generalizable Artistic Style Disentanglement

arXiv:2601.17697v1 · 2 citations

Originality: Highly original

AI Analysis

This work addresses the problem of artistic style representation for researchers and practitioners in computer vision and art analysis, offering a generalizable method with applications in style retrieval and evaluation.

The paper tackles the challenge of disentangling artistic style from semantic content. It proposes StyleDecoupler, an information-theoretic framework that isolates style features by using uni-modal representations as content references, and it reports state-of-the-art style retrieval performance on the WeART and WikiART datasets.

Representing artistic style is challenging due to its deep entanglement with semantic content. We propose StyleDecoupler, an information-theoretic framework that leverages a key insight: multi-modal vision models encode both style and content, while uni-modal models suppress style to focus on content-invariant features. By using uni-modal representations as content-only references, we isolate pure style features from multi-modal embeddings through mutual information minimization. StyleDecoupler operates as a plug-and-play module on frozen Vision-Language Models without fine-tuning. We also introduce WeART, a large-scale benchmark of 280K artworks across 152 styles and 1,556 artists. Experiments show state-of-the-art performance on style retrieval across WeART and WikiART, while enabling applications like style relationship mapping and generative model evaluation. We release our method and dataset at this url.
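
To make the idea of "subtracting" a content reference from a mixed embedding concrete, here is a minimal sketch on synthetic data. It is not the paper's method: the abstract does not specify the mutual information estimator or objective, so this sketch swaps in a much cruder linear proxy, namely removing whatever part of the multi-modal embedding is linearly predictable from the uni-modal (content) embedding. For jointly Gaussian variables, zero correlation implies zero mutual information, which is the only sense in which this stands in for mutual information minimization. The function name `linear_style_residual`, the toy dimensions, and the random "embeddings" are all assumptions made for illustration.

```python
import numpy as np


def linear_style_residual(multi, content):
    """Return the part of `multi` that is NOT linearly predictable from `content`.

    Hypothetical helper: a linear stand-in for mutual information
    minimization, not the StyleDecoupler objective itself.
    """
    # Center both sets of embeddings so the fit needs no intercept term.
    multi_c = multi - multi.mean(axis=0, keepdims=True)
    content_c = content - content.mean(axis=0, keepdims=True)
    # Least-squares map W that best predicts multi_c from content_c.
    W, *_ = np.linalg.lstsq(content_c, multi_c, rcond=None)
    # The residual is orthogonal to every content dimension.
    return multi_c - content_c @ W


# Synthetic stand-ins for real model outputs (assumed, not from the paper).
rng = np.random.default_rng(0)
n = 500
content_factor = rng.normal(size=(n, 4))  # latent "content"
style_factor = rng.normal(size=(n, 3))    # latent "style"

# Toy uni-modal embedding: depends on content only.
content_emb = content_factor @ rng.normal(size=(4, 8))
# Toy multi-modal embedding: mixes content and style.
multi_emb = (content_factor @ rng.normal(size=(4, 16))
             + style_factor @ rng.normal(size=(3, 16)))

style_emb = linear_style_residual(multi_emb, content_emb)

# Cross-covariance between the content reference and the residual;
# by construction of the least-squares fit it should be numerically zero.
content_centered = content_emb - content_emb.mean(axis=0, keepdims=True)
cross_cov = content_centered.T @ style_emb / n
max_abs_cross_cov = float(np.abs(cross_cov).max())
print(f"max |cross-covariance|: {max_abs_cross_cov:.2e}")
```

With real models, `multi_emb` and `content_emb` would come from a frozen vision-language encoder and a frozen uni-modal encoder respectively. A linear residual only removes linear dependence, so a learned estimator of the kind the abstract alludes to would be needed to suppress non-linear content leakage.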
