CV, GR | Aug 21, 2018

Deep Video-Based Performance Cloning

arXiv:1808.06847v1 | 83 citations
Originality: Incremental advance
AI Analysis

This paper addresses video synthesis for performance cloning, with applications in entertainment and media, but it appears incremental because it builds on existing generative models.

The paper tackles the problem of generating videos where a target actor reenacts performances from other videos, using only ordinary video segments without motion capture or depth. It demonstrates promising results in generating temporally coherent videos for challenging scenarios like different dance performances.

We present a new video-based performance cloning technique. After training a deep generative network using a reference video capturing the appearance and dynamics of a target actor, we are able to generate videos where this actor reenacts other performances. All of the training data and the driving performances are provided as ordinary video segments, without motion capture or depth information. Our generative model is realized as a deep neural network with two branches, both of which train the same space-time conditional generator, using shared weights. One branch, responsible for learning to generate the appearance of the target actor in various poses, uses paired training data, self-generated from the reference video. The second branch uses unpaired data to improve generation of temporally coherent video renditions of unseen pose sequences. We demonstrate a variety of promising results, where our method is able to generate temporally coherent videos, for challenging scenarios where the reference and driving videos consist of very different dance performances. Supplementary video: https://youtu.be/JpwsEeqNhhA.
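The two-branch idea in the abstract (one generator, shared weights, a paired loss plus an unpaired loss) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and not the paper's implementation: the names `SharedGenerator`, `paired_loss`, `unpaired_loss`, and `train_step` are hypothetical, the generator is a two-parameter affine map rather than a deep network, the paired loss is a plain mean squared error, and the unpaired loss is a simple temporal-consistency stand-in.

```python
class SharedGenerator:
    """Toy stand-in for the conditional generator (hypothetical, affine map)."""

    def __init__(self, weight=0.5, bias=0.0):
        self.weight = weight
        self.bias = bias

    def __call__(self, pose_pair):
        # Map two consecutive "poses" (lists of floats) to two "frames".
        return [[self.weight * p + self.bias for p in pose] for pose in pose_pair]


def paired_loss(gen, pose_pair, frame_pair):
    # Paired branch: compare generated frames to ground-truth frames (MSE stand-in).
    out = gen(pose_pair)
    errs = [(o - t) ** 2 for of, tf in zip(out, frame_pair) for o, t in zip(of, tf)]
    return sum(errs) / len(errs)


def unpaired_loss(gen, pose_pair):
    # Unpaired branch: no ground truth, so only penalize frame-to-frame change
    # that differs from the pose-to-pose change (assumed coherence stand-in).
    out = gen(pose_pair)
    errs = [((o1 - o0) - (p1 - p0)) ** 2
            for o0, o1, p0, p1 in zip(out[0], out[1], pose_pair[0], pose_pair[1])]
    return sum(errs) / len(errs)


def total_loss(gen, paired_batch, unpaired_batch, lam=0.1):
    # Both branches evaluate the SAME generator object, i.e. shared weights.
    lp = sum(paired_loss(gen, p, f) for p, f in paired_batch) / len(paired_batch)
    lu = sum(unpaired_loss(gen, p) for p in unpaired_batch) / len(unpaired_batch)
    return lp + lam * lu


def train_step(gen, paired_batch, unpaired_batch, lr=0.05, eps=1e-4):
    # Central finite-difference gradient step on each shared parameter in turn.
    for name in ("weight", "bias"):
        base = getattr(gen, name)
        setattr(gen, name, base + eps)
        up = total_loss(gen, paired_batch, unpaired_batch)
        setattr(gen, name, base - eps)
        down = total_loss(gen, paired_batch, unpaired_batch)
        setattr(gen, name, base - lr * (up - down) / (2 * eps))


# Paired data: poses with matching frames, as if self-generated from a reference video.
ref_poses = [([0.0, 1.0], [0.5, 1.5]), ([1.0, 0.0], [0.5, 0.5])]
paired_batch = [(pp, [[2.0 * p + 1.0 for p in pose] for pose in pp]) for pp in ref_poses]
# Unpaired data: driving poses with no corresponding target-actor frames.
unpaired_batch = [([0.0, 0.0], [1.0, 2.0])]

gen = SharedGenerator()
before = total_loss(gen, paired_batch, unpaired_batch)
for _ in range(200):
    train_step(gen, paired_batch, unpaired_batch)
after = total_loss(gen, paired_batch, unpaired_batch)
print(f"loss before: {before:.4f}, after: {after:.4f}")
```

The point of the sketch is only that a single set of parameters receives gradients from both loss terms; the relative weight `lam` is an arbitrary illustrative value.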

