Dense Optical Flow Prediction from a Static Image
This paper addresses a non-semantic form of action prediction for computer vision applications, but the contribution is incremental because it builds on existing CNN-based methods.
The paper tackles the problem of predicting dense optical flow from a single static image, using a CNN trained on tens of thousands of realistic videos without human labeling, and reports outperforming all previous approaches by large margins.
Given a scene, what is going to move, and in what direction will it move? Such a question could be considered a non-semantic form of action prediction. In this work, we present a convolutional neural network (CNN) based approach for motion prediction. Given a static image, this CNN predicts the future motion of every pixel in the image in terms of optical flow. Our model is trained on tens of thousands of realistic videos. Our method relies on no human labeling and is able to predict motion based on the context of the scene. Because our CNN model makes no assumptions about the underlying scene, it can predict future optical flow in a diverse set of scenarios. We outperform all previous approaches by large margins.
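To make the per-pixel output described above concrete, the following is a minimal sketch of how a dense flow field could be represented and queried for "what moves, and in which direction". It is not the paper's model: the names `FlowField`, `zero_flow`, `to_polar`, and `moving_pixels` are hypothetical, the predictor is a placeholder that stands in for a trained CNN, and the 0.5-pixel threshold is an arbitrary assumption.

```python
import math
from typing import List, Tuple

# Assumed representation: one (dx, dy) displacement in pixels per image pixel.
Flow = Tuple[float, float]
FlowField = List[List[Flow]]  # indexed as field[row][col]


def zero_flow(height: int, width: int) -> FlowField:
    """Placeholder predictor: reports no motion anywhere.

    A trained network would replace this and fill in one vector per pixel.
    """
    return [[(0.0, 0.0) for _ in range(width)] for _ in range(height)]


def to_polar(flow: Flow) -> Tuple[float, float]:
    """Convert (dx, dy) to (magnitude, direction in degrees within [0, 360))."""
    dx, dy = flow
    magnitude = math.hypot(dx, dy)
    direction = math.degrees(math.atan2(dy, dx)) % 360.0
    return magnitude, direction


def moving_pixels(field: FlowField, min_magnitude: float = 0.5) -> List[Tuple[int, int]]:
    """Return (row, col) positions whose predicted motion exceeds the threshold."""
    return [
        (r, c)
        for r, row in enumerate(field)
        for c, vec in enumerate(row)
        if to_polar(vec)[0] > min_magnitude
    ]


# Example: a 2x3 field in which only the pixel at row 0, column 2 moves.
field = zero_flow(2, 3)
field[0][2] = (3.0, 4.0)
```

Here `moving_pixels(field)` answers the "what is going to move" question and `to_polar` answers the "in what direction" question for a single pixel. The direction is measured in the coordinate frame of the stored vectors; in image coordinates the y axis usually points downward, so a positive `dy` would mean motion toward the bottom of the image.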