CV · LG · ML · Jan 23, 2019

Learning to navigate image manifolds induced by generative adversarial networks for unsupervised video generation

arXiv:1901.11384v1
Originality: Incremental advance
AI Analysis

This work addresses unsupervised video generation. The advance is incremental: it builds on existing GAN methods and contributes a two-step training scheme.

The paper tackles the problem of generating realistic video sequences by introducing a two-step GAN framework that first trains a static frame generator and then a recurrent model to navigate the image manifold, resulting in more natural-looking scenes.

In this work, we introduce a two-step framework for generative modeling of temporal data. Specifically, the generative adversarial network (GAN) setting is employed to generate synthetic scenes of moving objects. To do so, we propose a two-step training scheme in which a generator of static frames is trained first. Afterwards, a recurrent model is trained to provide a sequence of inputs to the previously trained frame generator, thus yielding scenes that look natural. The adversarial setting is employed in both training steps. However, to avoid known training instabilities in GANs, a multiple-discriminator approach is used to train both models. Results on the studied video dataset indicate that, with this approach, the recurrent part learns to coherently navigate the image manifold induced by the frame generator, thus yielding more natural-looking scenes.
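The data flow described in the abstract can be illustrated with a minimal, untrained sketch. It is not the authors' implementation: the class names `FrameGenerator` and `LatentNavigator`, the linear-plus-tanh layers, all dimensions, and the choice to average the generator loss over discriminators are assumptions made here for illustration. The abstract only states that a frame generator is trained first, that a recurrent model then feeds it a sequence of inputs, and that multiple discriminators are used.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Random weight matrix with small Gaussian entries (illustrative only)."""
    return [[rng.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class FrameGenerator:
    """Step 1 stand-in (hypothetical): maps a latent vector to a flat frame.

    In the two-step scheme this model is trained first and then kept fixed.
    """

    def __init__(self, latent_dim, frame_size, rng):
        self.weights = make_matrix(frame_size, latent_dim, rng)

    def __call__(self, z):
        # tanh keeps every pixel value in [-1, 1].
        return [math.tanh(v) for v in matvec(self.weights, z)]


class LatentNavigator:
    """Step 2 stand-in (hypothetical): a plain recurrent cell that emits
    one latent vector per time step from a single noise sample."""

    def __init__(self, noise_dim, latent_dim, rng):
        self.w_in = make_matrix(latent_dim, noise_dim, rng)
        self.w_rec = make_matrix(latent_dim, latent_dim, rng)

    def __call__(self, noise, num_frames):
        hidden = [0.0] * len(self.w_rec)
        drive = matvec(self.w_in, noise)
        latents = []
        for _ in range(num_frames):
            recurrent = matvec(self.w_rec, hidden)
            hidden = [math.tanh(a + b) for a, b in zip(drive, recurrent)]
            latents.append(hidden)
        return latents


def generate_video(navigator, frame_generator, noise, num_frames):
    """Decode each latent on the recurrent trajectory into a frame."""
    return [frame_generator(z) for z in navigator(noise, num_frames)]


def multi_discriminator_loss(discriminator_scores):
    """Average non-saturating generator loss over several discriminators.

    Averaging is an assumption; the abstract does not say how the
    discriminators' signals are combined.
    """
    losses = [-math.log(max(s, 1e-12)) for s in discriminator_scores]
    return sum(losses) / len(losses)


rng = random.Random(0)
frame_generator = FrameGenerator(latent_dim=8, frame_size=16, rng=rng)
navigator = LatentNavigator(noise_dim=4, latent_dim=8, rng=rng)
noise = [rng.gauss(0.0, 1.0) for _ in range(4)]
video = generate_video(navigator, frame_generator, noise, num_frames=5)
```

Here `video` is a list of five flattened frames. In an actual training loop, only the recurrent model's weights would be updated in the second step, with each discriminator scoring the generated sequence and the combined loss driving the update.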

Code Implementations: 1 repo
