CVOct 26, 2024

UniVST: A Unified Framework for Training-free Localized Video Style Transfer

arXiv:2410.20084v517 citationsh-index: 32Has CodeIEEE Trans Pattern Anal Mach Intell
Originality Incremental advance
AI Analysis

This work addresses the challenge of applying style transfer to specific objects in videos while maintaining detail and consistency, representing an incremental improvement over prior diffusion-based approaches.

The paper tackles the problem of localized video style transfer without training by introducing UniVST, a unified framework based on diffusion models that achieves superior performance in preserving object style and temporal consistency compared to existing methods.

This paper presents UniVST, a unified framework for localized video style transfer based on diffusion models. It operates without the need for training, offering a distinct advantage over existing diffusion methods that transfer style across entire videos. The endeavors of this paper comprise: (1) A point-matching mask propagation strategy that leverages the feature maps from the DDIM inversion. This streamlines the model's architecture by obviating the need for tracking models. (2) A training-free AdaIN-guided localized video stylization mechanism that operates at both the latent and attention levels. This balances content fidelity and style richness, mitigating the loss of localized details commonly associated with direct video stylization. (3) A sliding-window consistent smoothing scheme that harnesses optical flow within the pixel representation and refines predicted noise to update the latent space. This significantly enhances temporal consistency and diminishes artifacts in stylized video. Our proposed UniVST has been validated to be superior to existing methods in quantitative and qualitative metrics. It adeptly addresses the challenges of preserving the primary object's style while ensuring temporal consistency and detail preservation. Our code is available at https://github.com/QuanjianSong/UniVST.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes