CV · Sep 9, 2024

MRStyle: A Unified Framework for Color Style Transfer with Multi-Modality Reference

arXiv:2409.05250v1 · 1 citation · h-index: 12
Originality: Incremental advance
AI Analysis

The paper addresses flexible color style transfer for image editing applications. It appears incremental, since it builds on existing style transfer and diffusion prior techniques.

The paper tackles color style transfer using multi-modality references (image and text) by introducing MRStyle, a framework that unifies image and text style features and outperforms state-of-the-art methods in both qualitative and quantitative evaluations.

In this paper, we introduce MRStyle, a comprehensive framework that enables color style transfer using multi-modality references, including image and text. To achieve a unified style feature space for both modalities, we first develop a neural network called IRStyle, which generates stylized 3D lookup tables for image reference. This is accomplished by integrating an interaction dual-mapping network with a combined supervised learning pipeline, resulting in three key benefits: elimination of visual artifacts, efficient handling of high-resolution images with low memory usage, and maintenance of style consistency even in situations with significant color style variations. For text reference, we align the text feature of Stable Diffusion priors with the style feature of our IRStyle to perform text-guided color style transfer (TRStyle). Our TRStyle method is highly efficient in both training and inference, producing notable open-set text-guided transfer results. Extensive experiments in both image and text settings demonstrate that our proposed method outperforms the state-of-the-art in both qualitative and quantitative evaluations.
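To make the role of the 3D lookup table concrete, the sketch below shows how a generic 3D LUT can be applied to an RGB image with trilinear interpolation. It is not the IRStyle network: the function names `identity_lut` and `apply_lut`, the LUT size, and the `[r, g, b]` indexing order are assumptions chosen for illustration. It may help explain the efficiency claim, since a small LUT is applied per pixel and the cost of applying it does not depend on how the LUT was produced.

```python
import numpy as np


def identity_lut(size: int = 17) -> np.ndarray:
    """Return an identity LUT of shape (size, size, size, 3), indexed [r, g, b].

    Hypothetical helper: a style network would predict a non-identity LUT.
    """
    grid = np.linspace(0.0, 1.0, size)
    r, g, b = np.meshgrid(grid, grid, grid, indexing="ij")
    return np.stack([r, g, b], axis=-1)


def apply_lut(image: np.ndarray, lut: np.ndarray) -> np.ndarray:
    """Apply a 3D LUT to an RGB image with values in [0, 1].

    image: float array of shape (H, W, 3)
    lut:   float array of shape (N, N, N, 3) with N >= 2
    """
    size = lut.shape[0]
    # Map colors to continuous LUT coordinates in [0, size - 1].
    pos = np.clip(image, 0.0, 1.0) * (size - 1)
    # Lower corner of the enclosing LUT cell; capped so lo + 1 stays in range.
    lo = np.minimum(np.floor(pos).astype(int), size - 2)
    frac = pos - lo

    out = np.zeros(image.shape, dtype=float)
    # Accumulate the 8 cell corners, each weighted by its trilinear weight.
    for dr in (0, 1):
        for dg in (0, 1):
            for db in (0, 1):
                wr = frac[..., 0] if dr else 1.0 - frac[..., 0]
                wg = frac[..., 1] if dg else 1.0 - frac[..., 1]
                wb = frac[..., 2] if db else 1.0 - frac[..., 2]
                corner = lut[lo[..., 0] + dr, lo[..., 1] + dg, lo[..., 2] + db]
                out += (wr * wg * wb)[..., None] * corner
    return out
```

With an identity LUT the image is returned unchanged, and any LUT whose entries are an affine function of the grid (for example `1.0 - identity_lut()`, a color inversion) is reproduced exactly by the interpolation. A 17×17×17 LUT holds only 14,739 values, which is why this kind of transfer can stay cheap in memory even for high-resolution inputs.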

