Long-Term Video Interpolation with Bidirectional Predictive Network
The work addresses the problem of generating longer video sequences for applications such as video editing or simulation, though it appears incremental because it builds on existing interpolation methods.
The paper tackles long-term video interpolation by generating multiple frames between two non-consecutive frames with a bidirectional predictive network (BiPN), and reports competitive results on the Moving 2D Shapes and UCF101 benchmarks.
This paper considers the challenging task of long-term video interpolation. Unlike most existing methods, which generate only a few intermediate frames between adjacent existing frames, we attempt to speculate on or imagine the procedure of an episode and generate multiple frames between two non-consecutive frames in a video. We present a novel deep architecture called the bidirectional predictive network (BiPN), which predicts intermediate frames from two opposite directions. The bidirectional architecture allows the model to learn how a scene transforms over time and to generate longer video sequences. In addition, our model can be extended to predict multiple possible procedures by sampling different noise vectors. To train the model, we design a joint loss that combines clues in image space and feature space with an adversarial loss. We demonstrate the advantages of BiPN on two benchmarks, Moving 2D Shapes and UCF101, and report results that are competitive with recent approaches.
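The joint loss described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the individual terms or their weights, so everything below is an assumption made for illustration: an L1 distance in image space, a squared L2 distance in feature space, a non-saturating generator loss as the adversarial term, and the hypothetical weights `w_img`, `w_feat`, and `w_adv`. Frames and features are represented as flat lists of floats to keep the example self-contained.

```python
import math


def l1_loss(pred, target):
    """Mean absolute error between two equal-length flat sequences (assumed image-space term)."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def l2_loss(pred, target):
    """Mean squared error between two equal-length flat sequences (assumed feature-space term)."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def adversarial_loss(disc_scores, eps=1e-8):
    """Non-saturating generator loss: mean of -log D(G(x)).

    disc_scores are discriminator probabilities in (0, 1] for generated frames.
    This particular GAN formulation is an assumption, not taken from the abstract.
    """
    return -sum(math.log(s + eps) for s in disc_scores) / len(disc_scores)


def joint_loss(pred_pixels, true_pixels, pred_feats, true_feats, disc_scores,
               w_img=1.0, w_feat=1.0, w_adv=0.01):
    """Weighted sum of image-space, feature-space, and adversarial terms.

    The weights are hypothetical placeholders, not values from the paper.
    """
    return (w_img * l1_loss(pred_pixels, true_pixels)
            + w_feat * l2_loss(pred_feats, true_feats)
            + w_adv * adversarial_loss(disc_scores))


if __name__ == "__main__":
    # Toy inputs: two pixels, two feature values, one discriminator score.
    loss = joint_loss([0.2, 0.8], [0.0, 1.0], [1.0, 2.0], [1.0, 2.5], [0.5])
    print(f"joint loss: {loss:.4f}")
```

In a sketch like this, the image-space and feature-space terms pull the generated frames toward the ground truth, while the adversarial term rewards frames that the discriminator scores as realistic. The relative weights control that trade-off and would need tuning in practice.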