Prediction Under Uncertainty with Error-Encoding Networks
This paper addresses the problem of handling uncertainty in video prediction for AI applications and offers a simpler training approach, but the contribution appears incremental because it builds on existing disentanglement and latent-variable methods.
The paper tackles temporal prediction under uncertainty by disentangling predictable and unpredictable components, encoding the latter into a latent variable for a forward model, and shows consistent generation of diverse predictions on multiple video datasets without adversarial training.
In this work, we introduce a new framework for performing temporal predictions in the presence of uncertainty. It is based on the simple idea of disentangling the components of the future state that are predictable from those that are inherently unpredictable, and encoding the unpredictable components into a low-dimensional latent variable which is fed into a forward model. Our method uses a supervised training objective that is fast and easy to optimize. We evaluate it in the context of video prediction on multiple datasets and show that it is able to consistently generate diverse predictions without the need for alternating minimization over a latent space or adversarial training.
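The error-encoding idea described above can be illustrated with a small linear toy. This is a sketch under stated assumptions, not the authors' implementation: it swaps the paper's neural networks for closed-form least squares and a one-dimensional PCA projection, fits the two models in sequence rather than with a single training objective, and uses synthetic vectors instead of video frames. All names (f, g, phi, sample_predictions) and the data-generating process are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumption): y = x A + s * d, where the sign s is
# random and therefore cannot be predicted from x.
n, dx, dy = 200, 3, 5
X = rng.normal(size=(n, dx))
A = rng.normal(size=(dx, dy))
d = rng.normal(size=dy)                 # direction of the unpredictable part
s = rng.choice([-1.0, 1.0], size=n)     # unpredictable sign
Y = X @ A + s[:, None] * d

def with_bias(M):
    """Append a constant column so linear fits include an intercept."""
    return np.hstack([M, np.ones((len(M), 1))])

# 1) Deterministic model f(x): captures the predictable component.
Wf, *_ = np.linalg.lstsq(with_bias(X), Y, rcond=None)
Y_det = with_bias(X) @ Wf

# 2) Residual error: what f could not predict.
E = Y - Y_det

# 3) Encode the residual into a 1-D latent z = phi(e).
#    Here phi is a projection onto the top principal direction of E.
_, _, Vt = np.linalg.svd(E, full_matrices=False)
phi = Vt[0]
Z = E @ phi

# 4) Latent-conditioned forward model g(x, z), fit with a supervised loss.
XZ = with_bias(np.column_stack([X, Z]))
Wg, *_ = np.linalg.lstsq(XZ, Y, rcond=None)
Y_cond = XZ @ Wg

mse_det = np.mean((Y - Y_det) ** 2)
mse_cond = np.mean((Y - Y_cond) ** 2)

# 5) Generation: hold x fixed and draw z from latents seen in training.
def sample_predictions(x, k=4):
    zs = rng.choice(Z, size=k)
    inputs = with_bias(np.column_stack([np.tile(x, (k, 1)), zs]))
    return inputs @ Wg

preds = sample_predictions(X[0], k=4)
print(f"deterministic MSE: {mse_det:.4f}, latent-conditioned MSE: {mse_cond:.2e}")
print("sampled predictions shape:", preds.shape)
```

In this toy the latent carries exactly the information the deterministic model is missing, so conditioning on it removes the remaining error. Drawing different z values for the same x yields different plausible futures, with no adversarial training and no alternating minimization over the latent space.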