IV | CV | Dec 5, 2018

Learning to Take Directions One Step at a Time

arXiv:1812.01874v3 | 12 citations
Originality: Incremental advance
AI Analysis

This work addresses controllable video generation, with applications in animation and simulation. The contribution is incremental, since it builds on existing motion control methods.

The paper generates video sequences from a single image, using motion strokes as control signals. The method can produce an arbitrary number of frames and is evaluated on several datasets, including MNIST and Human3.6M.

We present a method to generate a video sequence given a single image. Because items in an image can be animated in arbitrarily many different ways, we introduce a sequence of motion strokes as a control signal. Such a control signal can be automatically transferred from other videos, e.g., via bounding box tracking. Each motion stroke provides the direction to the moving object in the input image, and we aim to train a network to generate an animation following a sequence of such directions. To address this task we design a novel recurrent architecture, which can be trained easily and effectively thanks to an explicit separation of past, future and current states. As we demonstrate in the experiments, our proposed architecture is capable of generating an arbitrary number of frames from a single image and a sequence of motion strokes. Key components of our architecture are an autoencoding constraint to ensure consistency with the past and a generative adversarial scheme to ensure that images look realistic and are temporally smooth. We demonstrate the effectiveness of our approach on the MNIST, KTH, Human3.6M, Push and Weizmann datasets.
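To make the control signal concrete, here is a minimal sketch of how motion strokes might be derived from tracked bounding boxes and then consumed one step at a time. It is not the paper's implementation: the box layout `(x_min, y_min, x_max, y_max)`, the representation of a stroke as a `(dx, dy)` displacement of the box center, and the names `strokes_from_boxes`, `rollout` and `toy_step` are all assumptions made for illustration. The `step` argument stands in for a trained generator; the toy version below only moves a point, and it ignores the past/future state separation, autoencoding constraint and adversarial training described in the abstract.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

# Hypothetical conventions, not taken from the paper:
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Stroke = Tuple[float, float]             # (dx, dy) displacement of the box center
Frame = TypeVar("Frame")


def box_center(box: Box) -> Tuple[float, float]:
    """Return the center point of a bounding box."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def strokes_from_boxes(boxes: Sequence[Box]) -> List[Stroke]:
    """Build one stroke per consecutive pair of tracked boxes."""
    centers = [box_center(b) for b in boxes]
    return [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(centers, centers[1:])]


def rollout(
    first_frame: Frame,
    strokes: Sequence[Stroke],
    step: Callable[[Frame, Stroke], Frame],
) -> List[Frame]:
    """Generate len(strokes) new frames, feeding each output back as the next input."""
    frames = [first_frame]
    for stroke in strokes:
        frames.append(step(frames[-1], stroke))
    return frames


def toy_step(position: Tuple[float, float], stroke: Stroke) -> Tuple[float, float]:
    """Stand-in for a trained generator: a 'frame' is just a 2D point that moves."""
    return (position[0] + stroke[0], position[1] + stroke[1])


# Boxes as they might come from a tracker run on a reference video.
tracked_boxes = [(0.0, 0.0, 2.0, 2.0), (1.0, 0.0, 3.0, 2.0), (1.0, 2.0, 3.0, 4.0)]
strokes = strokes_from_boxes(tracked_boxes)
frames = rollout(box_center(tracked_boxes[0]), strokes, toy_step)
print(strokes)
print(frames)
```

Because the number of generated frames equals the number of strokes, extending the stroke sequence extends the video, which mirrors the abstract's claim that an arbitrary number of frames can be produced from a single image.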

Code Implementations: 1 repo