CV · Jan 26, 2023

ITstyler: Image-optimized Text-based Style Transfer

arXiv:2301.10916v1 · 8 citations · h-index: 8
Originality: Incremental advance
AI Analysis

The method enables more efficient text-guided artistic image synthesis for creative applications, though it appears incremental because it builds on the existing VGG and CLIP frameworks.

Earlier text-based style transfer methods need either extra optimization time or text-image paired data. The paper addresses this by converting text into the VGG style space through CLIP embeddings, which yields real-time transfer with no optimization at inference.

Text-based style transfer is a newly emerging research topic that uses text information instead of a style image to guide the transfer process, significantly extending the application scenarios of style transfer. However, previous methods require extra time for optimization or text-image paired data, which limits their effectiveness. In this work, we achieve a data-efficient text-based style transfer method that does not require optimization at the inference stage. Specifically, we convert the text input to the style space of the pre-trained VGG network to realize a more effective style swap. We also leverage CLIP's multi-modal embedding space to learn the text-to-style mapping with an image dataset only. Our method can transfer arbitrary new styles from text input in real time and synthesize high-quality artistic images.
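The idea of mapping text into a VGG style space and then swapping style can be illustrated with a rough sketch. This is not the paper's implementation. It assumes that "style" is represented by channel-wise mean and standard deviation of a feature map (as in AdaIN-style methods), which the abstract does not confirm. `TextToStyle` is a hypothetical, untrained linear mapper, and random arrays stand in for VGG features and for a CLIP text embedding.

```python
import numpy as np


def channel_stats(feat, eps=1e-5):
    """Per-channel mean and std of a (C, H, W) feature map."""
    flat = feat.reshape(feat.shape[0], -1)
    mean = flat.mean(axis=1)
    std = np.sqrt(flat.var(axis=1) + eps)
    return mean, std


def style_swap(content_feat, style_mean, style_std, eps=1e-5):
    """Re-normalize content features to the given style statistics.

    Assumption: style is carried by channel-wise statistics (AdaIN-style).
    """
    c_mean, c_std = channel_stats(content_feat, eps)
    normalized = (content_feat - c_mean[:, None, None]) / c_std[:, None, None]
    return normalized * style_std[:, None, None] + style_mean[:, None, None]


class TextToStyle:
    """Hypothetical linear mapper from an embedding to style statistics."""

    def __init__(self, embed_dim, channels, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(scale=0.02, size=(embed_dim, 2 * channels))
        self.b = np.zeros(2 * channels)
        self.channels = channels

    def __call__(self, embedding):
        out = embedding @ self.w + self.b
        mean = out[: self.channels]
        # Softplus keeps the predicted standard deviation positive.
        std = np.log1p(np.exp(out[self.channels:]))
        return mean, std


rng = np.random.default_rng(1)
content_feat = rng.normal(size=(8, 4, 4))  # stand-in for VGG content features
text_embedding = rng.normal(size=16)       # stand-in for a CLIP text embedding

mapper = TextToStyle(embed_dim=16, channels=8)
style_mean, style_std = mapper(text_embedding)
stylized = style_swap(content_feat, style_mean, style_std)
```

Under these assumptions, training on images only would mean fitting the mapper on CLIP image embeddings, with the VGG statistics of the same image as targets. Because CLIP places text and images in a shared embedding space, a text embedding could then be fed to the same mapper at inference, with a single forward pass and no per-input optimization.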

