Learning to Linearize Under Uncertainty
This work addresses the challenge of principled unsupervised learning of deep feature hierarchies in computer vision.
The paper tackles unsupervised training of deep feature hierarchies by proposing a new architecture and loss that linearize the transformations observed in unlabeled natural video sequences. It trains a generative model to predict video frames and introduces latent variables to handle uncertainty in prediction.
Training deep feature hierarchies to solve supervised learning tasks has achieved state-of-the-art performance on many problems in computer vision. However, a principled way to train such hierarchies in the unsupervised setting has remained elusive. In this work, we suggest a new architecture and loss for training deep feature hierarchies that linearize the transformations observed in unlabeled natural video sequences. This is done by training a generative model to predict video frames. We also address the problem of inherent uncertainty in prediction by introducing into the network architecture latent variables that are non-deterministic functions of the input.
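To make the idea of linearization concrete, the following is a minimal sketch of one possible reading: if the feature codes of consecutive frames lie on a straight line, the next code can be predicted by linear extrapolation from the previous two. The extrapolation rule, the toy two-dimensional codes, and the function names `extrapolate_linear` and `linearization_residual` are illustrative assumptions and are not taken from the text above, which does not specify the encoder, the decoder, or the exact loss.

```python
import numpy as np

def extrapolate_linear(z_prev, z_curr):
    """Continue the straight line through two feature codes (assumed rule)."""
    return 2.0 * z_curr - z_prev

def linearization_residual(z_prev, z_curr, z_next):
    """Squared distance between the true next code and its linear extrapolation."""
    diff = z_next - extrapolate_linear(z_prev, z_curr)
    return float(np.sum(diff ** 2))

# Toy feature codes for three consecutive frames on a straight trajectory.
z0 = np.array([0.0, 1.0])
step = np.array([0.5, -0.25])
z1 = z0 + step
z2 = z1 + step

# A perfectly linear trajectory has zero residual.
residual_straight = linearization_residual(z0, z1, z2)

# Bending the trajectory makes the residual positive.
z2_bent = z2 + np.array([0.0, 0.3])
residual_bent = linearization_residual(z0, z1, z2_bent)

print(residual_straight, residual_bent)
```

Under this reading, a residual of this kind is small when the learned features change linearly over time and grows as the feature trajectory curves. A deterministic extrapolation like this one cannot represent several plausible futures, which is the uncertainty that the latent variables are introduced to handle.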