Zero-Episode Few-Shot Contrastive Predictive Coding: Solving intelligence tests without prior training
This work addresses the problem of reducing training data requirements for video prediction models. The contribution is incremental in that it builds on existing contrastive predictive coding methods.
The paper tackles the challenge of data-efficient video prediction by proposing a method that identifies a predictive latent variable and uses it to evaluate the consistency of a future image, eliminating the need to train a generative model. It demonstrates that a one-dimensional Markov Contrastive Predictive Coding (M-CPC_1D) model solves sequence completion intelligence tests with only five examples, and it applies the model to anomaly detection and stochastic movement video prediction without prior training.
Video prediction models often combine three components: an encoder from pixel space to a small latent space, a latent space prediction model, and a generative model back to pixel space. However, the large and unpredictable pixel space makes training such models difficult, requiring many training examples. We argue that finding a predictive latent variable and using it to evaluate the consistency of a future image enables data-efficient predictions because it removes the need to train a generative model. To demonstrate this, we created sequence completion intelligence tests in which the task is to identify a predictably changing feature in a sequence of images and use this prediction to select the subsequent image. We show that a one-dimensional Markov Contrastive Predictive Coding (M-CPC_1D) model solves these tests efficiently, with only five examples. Finally, we demonstrate the usefulness of M-CPC_1D in solving two tasks without prior training: anomaly detection and stochastic movement video prediction.
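The selection step described above, scoring candidate images by how consistent their latent value is with a prediction rather than generating pixels, can be illustrated with a toy sketch. This is not the M-CPC_1D model itself: the encoder here is a hand-written mean-brightness function standing in for a learned one, the Markov transition is assumed to be linear and fitted by least squares, and no contrastive training is shown. All names (`mean_brightness`, `fit_linear_markov`, `select_next_image`, `flat_image`) are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[List[float]]  # toy grayscale image as nested lists


def mean_brightness(image: Image) -> float:
    """Hypothetical hand-written 1D encoder (the paper learns its encoder)."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)


def fit_linear_markov(latents: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of z[t+1] = a * z[t] + b on consecutive latent pairs.

    Assumption: a linear first-order transition, chosen only for illustration.
    """
    xs, ys = latents[:-1], latents[1:]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0.0:
        # Constant latent: predict the mean of the observed next values.
        return 0.0, mean_y
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    return a, mean_y - a * mean_x


def select_next_image(
    context: Sequence[Image],
    candidates: Sequence[Image],
    encode: Callable[[Image], float] = mean_brightness,
) -> int:
    """Return the index of the candidate whose latent best matches the prediction."""
    latents = [encode(img) for img in context]
    a, b = fit_linear_markov(latents)
    predicted = a * latents[-1] + b
    # Consistency is judged in latent space; no image is ever generated.
    errors = [abs(encode(c) - predicted) for c in candidates]
    return errors.index(min(errors))


def flat_image(value: float, size: int = 2) -> Image:
    """Build a uniform toy image with every pixel equal to `value`."""
    return [[value] * size for _ in range(size)]


if __name__ == "__main__":
    # Brightness rises by 0.1 per frame; the consistent continuation is 0.5.
    context = [flat_image(v) for v in (0.1, 0.2, 0.3, 0.4)]
    candidates = [flat_image(v) for v in (0.9, 0.5, 0.1)]
    print(select_next_image(context, candidates))
```

In this toy setting the candidates are compared only through a single scalar, which is why no decoder back to pixel space is needed. A learned encoder would have to discover which feature changes predictably, which is the part the contrastive objective addresses and which this sketch leaves out.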