CV · May 30, 2023

Context-Preserving Two-Stage Video Domain Translation for Portrait Stylization

arXiv:2305.19135v1 · 1 citation
Originality: Incremental advance
AI Analysis

This paper addresses video portrait stylization for practical real-world applications, offering an incremental improvement over prior image-based methods.

The paper tackles the problem of generating temporally coherent stylized videos of real human faces, running in real time at a latency of 0.011 seconds per frame with only 5.6M parameters.

Portrait stylization, which translates a real human face image into an artistically stylized image, has attracted considerable interest, and many prior works have shown impressive quality in recent years. However, despite their remarkable performance on image-level translation tasks, prior methods show unsatisfactory results when applied to the video domain. To address this issue, we propose a novel two-stage video translation framework with an objective function that forces the model to generate a temporally coherent stylized video while preserving context in the source video. Furthermore, our model runs in real time with a latency of 0.011 seconds per frame and requires only 5.6M parameters, and is thus widely applicable to practical real-world applications.
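To put the two efficiency figures from the abstract in perspective, the following minimal sketch converts the per-frame latency into a frame rate and the parameter count into an approximate weight size. Only the 0.011 s latency and the 5.6M parameter count come from the source. The 32-bit float storage, the target frame rates, and all function names are assumptions made for illustration, and the frame rate assumes frames are processed one at a time.

```python
# Figures quoted in the abstract.
LATENCY_S_PER_FRAME = 0.011
NUM_PARAMETERS = 5.6e6

# Assumption (not stated in the source): weights are stored as 32-bit floats.
BYTES_PER_PARAM = 4


def frames_per_second(latency_s: float) -> float:
    """Throughput implied by a per-frame latency, one frame at a time."""
    return 1.0 / latency_s


def model_size_mb(num_params: float, bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Approximate weight storage in megabytes (1 MB = 10**6 bytes)."""
    return num_params * bytes_per_param / 1e6


def meets_frame_budget(latency_s: float, target_fps: float) -> bool:
    """True if one frame is processed within the time slot of the target frame rate."""
    return latency_s <= 1.0 / target_fps


if __name__ == "__main__":
    print(f"{frames_per_second(LATENCY_S_PER_FRAME):.1f} frames per second")
    print(f"{model_size_mb(NUM_PARAMETERS):.1f} MB of weights (float32 assumed)")
    for fps in (30, 60):  # hypothetical target frame rates
        print(f"fits a {fps} fps budget: {meets_frame_budget(LATENCY_S_PER_FRAME, fps)}")
```

Under these assumptions, the quoted latency corresponds to roughly 90 frames per second, which leaves headroom over common 30 and 60 fps video rates on whatever hardware the authors measured.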

