CV · Aug 19, 2019

Video synthesis of human upper body with realistic face

arXiv:1908.06607v3 · 16 citations
AI Analysis

This addresses video synthesis for applications like virtual avatars or entertainment, but appears incremental as it builds on existing GAN-based techniques with specific intermediate representations.

The paper tackles the problem of generating realistic upper body videos of a target person that match the body motion, facial expression, and pose from a source video, using a generative adversarial learning approach with intermediate representations like keypoints and facial landmarks. Experimental results show the method is effective, though no concrete numbers are provided.

This paper presents a generative adversarial learning-based approach to human upper body video synthesis. It generates an upper body video of a target person that is consistent with the body motion, facial expression, and pose of the person in a source video. We use upper body keypoints, facial action units, and poses as intermediate representations between the source video and the target video. Instead of directly transferring the source video to the target video, we first map the source person's facial action units and poses into the target person's facial landmarks, then combine the normalized upper body keypoints and the generated facial landmarks with spatio-temporal smoothing to generate the corresponding image of the target video. Experimental results demonstrate the effectiveness of our method.
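To make the keypoint normalization and smoothing steps above concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not specify how keypoints are normalized or smoothed, so this assumes a simple translate-and-scale normalization and a centered moving average over time (temporal smoothing only). The function names `normalize_keypoints` and `smooth_temporal`, the anchor and scale parameters, and the window size are all hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Frame = List[Point]  # one (x, y) pair per keypoint


def normalize_keypoints(frame: Frame,
                        src_anchor: Point, src_scale: float,
                        tgt_anchor: Point, tgt_scale: float) -> Frame:
    """Translate and scale source keypoints into the target's coordinate frame.

    Assumption: anchor is a reference point (e.g. shoulder midpoint) and
    scale is a reference length (e.g. shoulder width) for each person.
    """
    ratio = tgt_scale / src_scale
    return [((x - src_anchor[0]) * ratio + tgt_anchor[0],
             (y - src_anchor[1]) * ratio + tgt_anchor[1])
            for x, y in frame]


def smooth_temporal(frames: List[Frame], window: int = 3) -> List[Frame]:
    """Centered moving average over time for every keypoint coordinate.

    Assumption: a plain moving average stands in for the paper's
    unspecified spatio-temporal smoothing. The window shrinks at the
    start and end of the sequence.
    """
    half = window // 2
    smoothed: List[Frame] = []
    for t in range(len(frames)):
        span = frames[max(0, t - half):min(len(frames), t + half + 1)]
        n = len(span)
        smoothed.append([
            (sum(f[k][0] for f in span) / n, sum(f[k][1] for f in span) / n)
            for k in range(len(frames[t]))
        ])
    return smoothed
```

In such a pipeline, each source frame's keypoints would be normalized first, the resulting sequence smoothed, and the result passed to the generator as conditioning input. A learned or filter-based smoother could replace the moving average without changing the interface.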

