LGCVMLApr 17, 2018

PredRNN++: Towards A Resolution of the Deep-in-Time Dilemma in Spatiotemporal Predictive Learning

arXiv:1804.06300v2595 citations
Originality Incremental advance
AI Analysis

This work addresses a key challenge in video prediction for applications like autonomous driving or surveillance, though it is incremental as it builds on existing recurrent models.

The paper tackles the deep-in-time dilemma in spatiotemporal predictive learning by introducing PredRNN++, a recurrent network that uses Causal LSTM and Gradient Highway to improve gradient flow, resulting in state-of-the-art prediction results on synthetic and real video datasets, including in occlusion scenarios.

We present PredRNN++, an improved recurrent network for video predictive learning. In pursuit of a greater spatiotemporal modeling capability, our approach increases the transition depth between adjacent states by leveraging a novel recurrent unit, which is named Causal LSTM for re-organizing the spatial and temporal memories in a cascaded mechanism. However, there is still a dilemma in video predictive learning: increasingly deep-in-time models have been designed for capturing complex variations, while introducing more difficulties in the gradient back-propagation. To alleviate this undesirable effect, we propose a Gradient Highway architecture, which provides alternative shorter routes for gradient flows from outputs back to long-range inputs. This architecture works seamlessly with causal LSTMs, enabling PredRNN++ to capture short-term and long-term dependencies adaptively. We assess our model on both synthetic and real video datasets, showing its ability to ease the vanishing gradient problem and yield state-of-the-art prediction results even in a difficult objects occlusion scenario.

Code Implementations11 repos

Data from Papers with Code (CC-BY-SA-4.0)

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes