CVLGAug 21, 2023

Long-Term Prediction of Natural Video Sequences with Robust Video Predictors

arXiv:2308.11079v12 citationsh-index: 45
Originality Incremental advance
AI Analysis

This work addresses the challenge of predicting complex natural video scenes over extended periods, which is incremental as it builds on existing methods with specific enhancements.

The paper tackled the problem of long-term prediction of natural video sequences by introducing improvements to create Robust Video Predictors (RoViPs), resulting in the ability to produce very long, realistic video sequences through iterated single-step prediction.

Predicting high dimensional video sequences is a curiously difficult problem. The number of possible futures for a given video sequence grows exponentially over time due to uncertainty. This is especially evident when trying to predict complicated natural video scenes from a limited snapshot of the world. The inherent uncertainty accumulates the further into the future you predict making long-term prediction very difficult. In this work we introduce a number of improvements to existing work that aid in creating Robust Video Predictors (RoViPs). We show that with a combination of deep Perceptual and uncertainty-based reconstruction losses we are able to create high quality short-term predictions. Attention-based skip connections are utilised to allow for long range spatial movement of input features to further improve performance. Finally, we show that by simply making the predictor robust to its own prediction errors, it is possible to produce very long, realistic natural video sequences using an iterated single-step prediction task.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes