CV, Apr 9, 2019

Multimodal Style Transfer via Graph Cuts

arXiv:1904.04443v6, 94 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of generating more aesthetically pleasing stylized images for applications in computer vision and graphics, representing an incremental improvement over existing style transfer methods.

The paper tackles the problem of neural style transfer by addressing the uniform treatment of semantic patterns in style images, which leads to unpleasing results on complex styles. It introduces Multimodal Style Transfer (MST), which clusters style features into sub-styles and matches them with content features using graph cuts, resulting in improved effectiveness, robustness, and flexibility as demonstrated in experiments.

An assumption widely used in recent neural style transfer methods is that image styles can be described by global statistics of deep features, such as Gram or covariance matrices. Alternative approaches have represented styles by decomposing them into local pixel or neural patches. Despite the recent progress, most existing methods treat the semantic patterns of a style image uniformly, resulting in unpleasing results on complex styles. In this paper, we introduce a more flexible and general universal style transfer technique: multimodal style transfer (MST). MST explicitly considers the matching of semantic patterns in content and style images. Specifically, the style image features are clustered into sub-style components, which are matched with local content features under a graph cut formulation. A reconstruction network is trained to transfer each sub-style and render the final stylized result. We also generalize MST to improve some existing methods. Extensive experiments demonstrate the superior effectiveness, robustness, and flexibility of MST.
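The clustering and matching steps described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation, and several choices are assumptions made for illustration only: plain k-means for the clustering, a cosine-distance data cost, a Potts smoothness penalty weighted by a hypothetical parameter `lam`, and iterated conditional modes (ICM) swapped in for a true graph cut solver so the example stays dependency-free. The function names are hypothetical, random toy vectors stand in for deep features, and the reconstruction network is omitted.

```python
import numpy as np

def kmeans(x, k, iters=20):
    """Lloyd's k-means on the rows of x, with farthest-point initialisation."""
    centers = [x[0]]
    for _ in range(1, k):
        # Squared distance from every point to its nearest chosen centre.
        d = np.min([((x - c) ** 2).sum(-1) for c in centers], axis=0)
        centers.append(x[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        d = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = d.argmin(1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = x[assign == j].mean(0)
    return centers

def data_cost(content, centers):
    """Cosine distance between each content feature and each sub-style centre.

    content: (H, W, C), centers: (K, C) -> cost: (H, W, K).
    """
    c = content / (np.linalg.norm(content, axis=-1, keepdims=True) + 1e-8)
    s = centers / (np.linalg.norm(centers, axis=-1, keepdims=True) + 1e-8)
    return 1.0 - c @ s.T

def match_icm(cost, lam=0.1, sweeps=5):
    """Approximately minimise data cost + Potts smoothness.

    Assumption: ICM is a simple stand-in here, not a graph cut algorithm.
    """
    h, w, k = cost.shape
    labels = cost.argmin(-1)  # start from the best label per location
    all_labels = np.arange(k)
    for _ in range(sweeps):
        for i in range(h):
            for j in range(w):
                e = cost[i, j].copy()
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < h and 0 <= nj < w:
                        # Potts term: pay lam for disagreeing with a neighbour.
                        e += lam * (all_labels != labels[ni, nj])
                labels[i, j] = e.argmin()
    return labels

# Toy data: two well-separated "sub-styles" and a content map with two regions.
rng = np.random.default_rng(0)
style = np.concatenate([
    rng.normal([5.0, 0.0, 0.0], 0.1, size=(50, 3)),
    rng.normal([0.0, 5.0, 0.0], 0.1, size=(50, 3)),
])
content = np.zeros((4, 6, 3))
content[:, :3] = [1.0, 0.0, 0.0]
content[:, 3:] = [0.0, 1.0, 0.0]

centers = kmeans(style, k=2)
labels = match_icm(data_cost(content, centers), lam=0.1)
print(labels)  # one sub-style index per content location
```

In this sketch each content location receives the index of one sub-style cluster, and the smoothness term discourages isolated label changes between neighbouring locations. A full pipeline would then transfer each sub-style to its matched region and decode the result with the trained reconstruction network.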

Code Implementations: 2 repos
