Fourier-based Video Prediction through Relational Object Motion
This work addresses the problem of blurry video predictions in autonomous systems, though it appears incremental relative to existing deep recurrent architectures.
The paper tackles video prediction by using frequency-domain methods and explicitly inferring object-motion relationships, yielding predictions that are consistent with the observed scene dynamics and do not suffer from blur.
The ability to predict future outcomes conditioned on observed video frames is crucial for intelligent decision-making in autonomous systems. Recently, deep recurrent architectures have been applied to the task of video prediction. However, these architectures often produce blurry predictions and require tedious training on large datasets. Here, we explore a different approach by (1) using frequency-domain methods for video prediction and (2) explicitly inferring object-motion relationships in the observed scene. The resulting predictions are consistent with the observed dynamics in a scene and do not suffer from blur.
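To make the frequency-domain idea concrete, here is a minimal sketch of one generic way to predict a frame in the Fourier domain. It is not the authors' method, which the abstract does not specify. It assumes a single global translation with circular boundaries and relies only on the Fourier shift theorem: a translation in image space is a per-frequency phase change, so repeating the phase change observed between two frames extrapolates the motion one step ahead. The function name `predict_next_frame` and the parameter `eps` are hypothetical, chosen for illustration.

```python
import numpy as np

def predict_next_frame(prev_frame, curr_frame, eps=1e-8):
    """Extrapolate one frame ahead by repeating the observed phase change.

    Illustrative assumption: the scene undergoes one global, circular
    translation between frames. This is not the paper's model.
    """
    F_prev = np.fft.fft2(prev_frame)
    F_curr = np.fft.fft2(curr_frame)
    # Normalized cross-power spectrum: per-frequency phase change
    # from prev_frame to curr_frame, with (near) unit magnitude.
    cross = F_curr * np.conj(F_prev)
    phase_step = cross / (np.abs(cross) + eps)
    # Apply the same phase change once more to move one step forward.
    F_next = F_curr * phase_step
    return np.real(np.fft.ifft2(F_next))

# Usage: a small square moving by (1, 2) pixels per frame.
frame = np.zeros((16, 16))
frame[4:8, 4:8] = 1.0
prev = frame
curr = np.roll(frame, (1, 2), axis=(0, 1))
pred = predict_next_frame(prev, curr)
target = np.roll(frame, (2, 4), axis=(0, 1))
print("max abs error:", np.max(np.abs(pred - target)))
```

Because the phase is shifted rather than pixel intensities being averaged, edges stay sharp in this toy setting. Handling several objects with different motions, as the abstract's object-motion relationships suggest, would require more than this global sketch.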