CVLGNEApr 30, 2020

PreCNet: Next-Frame Video Prediction Based on Predictive Coding

arXiv:2004.14878v327 citations
AI Analysis

This work addresses video prediction for applications like autonomous driving, though it is incremental as it adapts an existing neuroscience model to a known task.

The authors tackled next-frame video prediction by adapting a neuroscience predictive coding model into a deep learning framework (PreCNet), achieving state-of-the-art performance on a benchmark with improvements in MSE, PSNR, and SSIM, especially when using a larger training set.

Predictive coding, currently a highly influential theory in neuroscience, has not been widely adopted in machine learning yet. In this work, we transform the seminal model of Rao and Ballard (1999) into a modern deep learning framework while remaining maximally faithful to the original schema. The resulting network we propose (PreCNet) is tested on a widely used next frame video prediction benchmark, which consists of images from an urban environment recorded from a car-mounted camera, and achieves state-of-the-art performance. Performance on all measures (MSE, PSNR, SSIM) was further improved when a larger training set (2M images from BDD100k), pointing to the limitations of the KITTI training set. This work demonstrates that an architecture carefully based in a neuroscience model, without being explicitly tailored to the task at hand, can exhibit exceptional performance.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes