CV, MM | Oct 14, 2021

Towards Using Clothes Style Transfer for Scenario-aware Person Video Generation

arXiv:2110.11894v2
Originality: Incremental advance
AI Analysis

This paper addresses a domain-specific computer vision problem: generating realistic person videos with clothes style transfer. The contribution appears incremental because it builds on existing AdaIN-based architectures.

The paper tackles clothes style transfer for person video generation, which is challenging because of variations in person appearance and video scenarios. It proposes a framework with disentangled multi-branch encoders and a shared decoder, and reports better image quality and video coherence than state-of-the-art approaches on the TEDXPeople benchmark.

Clothes style transfer for person video generation is a challenging task because of drastic variations in intra-person appearance and video scenarios. To tackle this problem, most recent approaches use AdaIN-based architectures to extract clothes and scenario features for generation. However, these approaches lack fine-grained details and are prone to distorting the original person. To further improve generation performance, we propose a novel framework with disentangled multi-branch encoders and a shared decoder. Moreover, to achieve strong spatio-temporal consistency in the video, we design an inner-frame discriminator that takes cross-frame differences as input. The proposed framework also supports scenario adaptation. Extensive experiments on the TEDXPeople benchmark demonstrate the superiority of our method over state-of-the-art approaches in terms of image quality and video coherence.
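As a rough illustration of two ingredients the abstract names, the sketch below shows the standard adaptive instance normalization (AdaIN) operation and one possible reading of a "cross-frame difference". It is not the authors' implementation: the function names `adain` and `cross_frame_difference` are hypothetical, features are simplified to flat lists of floats for a single channel, and treating the cross-frame difference as an element-wise difference between consecutive frames is an assumption, since the abstract does not define it.

```python
from statistics import fmean, pstdev


def adain(content, style, eps=1e-5):
    """Standard AdaIN for one channel: re-normalize the content features
    so their mean and standard deviation match those of the style features.

    Both arguments are flat lists of floats (a simplification; real models
    apply this per channel over spatial feature maps).
    """
    mu_c, sigma_c = fmean(content), pstdev(content)
    mu_s, sigma_s = fmean(style), pstdev(style)
    # eps guards against division by zero for constant content features.
    return [sigma_s * (x - mu_c) / (sigma_c + eps) + mu_s for x in content]


def cross_frame_difference(frames):
    """Assumed definition: element-wise difference between each pair of
    consecutive frames. Each frame is a flat list of pixel values.
    Returns len(frames) - 1 difference frames.
    """
    return [
        [b - a for a, b in zip(prev, nxt)]
        for prev, nxt in zip(frames, frames[1:])
    ]


if __name__ == "__main__":
    stylized = adain([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0])
    print(stylized)
    print(cross_frame_difference([[0, 0], [1, 2], [3, 2]]))
```

Under the assumed definition, a discriminator fed these difference frames sees only what changes between frames, which is one plausible way to penalize temporal flicker.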
