CV, LG · Nov 21, 2021

Video Content Swapping Using GAN

arXiv:2111.10916v1
Originality: Synthesis-oriented
AI Analysis

This paper addresses video generation for applications like data augmentation and AR/VR, but appears incremental, as it builds on existing methods.

The paper tackles video generation by decomposing frames into content and pose, using a pre-trained pose detector and a generative model to synthesize videos from these codes.

Video generation is an interesting problem in computer vision, with applications in data augmentation, special effects in movies, AR/VR, and so on. With the advances of deep learning, many deep generative models have been proposed to solve this task. These models provide a way to utilize the unlabeled images and videos available online, since they can learn deep feature representations in an unsupervised manner. They can also generate different kinds of images, which is of great value for visual applications. However, generating a video is much more challenging, since we need to model not only the appearance of objects in the video but also their temporal motion. In this work, we break down each frame of the video into content and pose. We first extract the pose information from a video using a pre-trained human pose detector, and then use a generative model to synthesize the video from the content code and the pose code.
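The data flow described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the encoders, the generator, or how the content code is chosen, so every name below (`FrameCodes`, `encode_video`, `swap_content`, `toy_generator`) is hypothetical, the frames are toy dictionaries, and the "generator" is a placeholder that simply concatenates the two codes. The sketch also assumes that content stays roughly constant within a clip, so the first frame's content code stands in for the whole source video.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Vector = List[float]
Frame = Dict[str, Vector]  # toy frame: {"appearance": [...], "keypoints": [...]}


@dataclass
class FrameCodes:
    content: Vector  # appearance code: who or what is in the frame
    pose: Vector     # pose code: e.g. flattened keypoint coordinates


def encode_video(frames: List[Frame],
                 content_encoder: Callable[[Frame], Vector],
                 pose_detector: Callable[[Frame], Vector]) -> List[FrameCodes]:
    """Split every frame into a content code and a pose code."""
    return [FrameCodes(content_encoder(f), pose_detector(f)) for f in frames]


def swap_content(source: List[FrameCodes],
                 target: List[FrameCodes]) -> List[FrameCodes]:
    """Keep the target's per-frame poses, but use the source's content."""
    if not source:
        raise ValueError("source video has no frames")
    # Assumption: content is constant within a clip, so frame 0 represents it.
    content = source[0].content
    return [FrameCodes(list(content), list(t.pose)) for t in target]


def toy_generator(codes: FrameCodes) -> Vector:
    """Placeholder for a trained generator: concatenates content and pose."""
    return codes.content + codes.pose


# Hypothetical stand-ins for a learned content encoder and a pose detector.
content_encoder = lambda frame: list(frame["appearance"])
pose_detector = lambda frame: list(frame["keypoints"])

video_a = [
    {"appearance": [1.0, 0.0], "keypoints": [0.1, 0.2]},
    {"appearance": [1.0, 0.0], "keypoints": [0.3, 0.4]},
]
video_b = [
    {"appearance": [0.0, 1.0], "keypoints": [0.5, 0.6]},
    {"appearance": [0.0, 1.0], "keypoints": [0.7, 0.8]},
    {"appearance": [0.0, 1.0], "keypoints": [0.9, 1.0]},
]

codes_a = encode_video(video_a, content_encoder, pose_detector)
codes_b = encode_video(video_b, content_encoder, pose_detector)

# Content from video A, motion from video B.
swapped = swap_content(codes_a, codes_b)
synthesized = [toy_generator(c) for c in swapped]
```

The synthesized sequence has as many frames as video B, since the pose codes drive the motion, while every frame carries the content code taken from video A. In a real system the placeholder functions would be replaced by trained networks.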

