CV · Dec 1, 2021

The Augmented Image Prior: Distilling 1000 Classes by Extrapolating from a Single Image

arXiv:2112.00725v48.79 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of data efficiency in machine learning by demonstrating that a single image can serve as a strong prior for training across multiple domains, though it is incremental in building on knowledge distillation and augmentation techniques.

The paper tackles the problem of training neural networks from scratch using only a single image, augmentations of that image, and knowledge distillation from a supervised pretrained teacher. It reports accuracies of 94% on CIFAR-10, 69% on ImageNet, 51% on Kinetics-400, and 84% on SpeechCommands.

What can neural networks learn about the visual world when provided with only a single image as input? While any image obviously cannot contain the multitudes of all existing objects, scenes and lighting conditions - within the space of all 256^(3x224x224) possible 224-sized square images, it might still provide a strong prior for natural images. To analyze this 'augmented image prior' hypothesis, we develop a simple framework for training neural networks from scratch using a single image and augmentations using knowledge distillation from a supervised pretrained teacher. With this, we find the answer to the above question to be: 'surprisingly, a lot'. In quantitative terms, we find accuracies of 94%/74% on CIFAR-10/100, 69% on ImageNet, and by extending this method to video and audio, 51% on Kinetics-400 and 84% on SpeechCommands. In extensive analyses spanning 13 datasets, we disentangle the effect of augmentations, choice of data and network architectures and also provide qualitative evaluations that include lucid 'panda neurons' in networks that have never even seen one.
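The framework described in the abstract combines two ingredients: augmented views derived from one source image, and a distillation objective that pulls a student's predictions toward a pretrained teacher's. The following is a minimal sketch of those two pieces in plain Python, not the authors' implementation. It assumes a temperature-softened KL divergence as the distillation loss and random cropping as the only augmentation; the function names, the temperature value, and the toy 8x8 "image" are all hypothetical and chosen for illustration.

```python
import math
import random


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_sum = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_sum for s in scaled]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) between temperature-softened class distributions.

    Assumption: a standard soft-label distillation loss; the paper's exact
    objective is not given in this summary.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher (fixed target)
    log_q = log_softmax(student_logits, temperature)  # student (being trained)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def random_crop(image, crop_size, rng):
    """Return a square crop from a 2D image stored as a list of rows.

    Assumption: random cropping stands in for the richer augmentation
    pipeline the abstract refers to.
    """
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - crop_size)
    left = rng.randint(0, width - crop_size)
    return [row[left:left + crop_size] for row in image[top:top + crop_size]]


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy 8x8 single-channel "source image" (hypothetical data).
    source = [[r * 8 + c for c in range(8)] for r in range(8)]
    patch = random_crop(source, 4, rng)
    print(len(patch), len(patch[0]))
    # Made-up logits standing in for teacher and student outputs on the patch.
    print(distillation_loss([0.5, 1.0, 2.0], [0.1, 0.2, 3.0]))
```

In a real training loop, every crop of the single image would be passed through both networks, and the student's weights would be updated by gradient descent on this loss while the teacher stays frozen. No ground-truth labels are involved in this sketch; the teacher's soft predictions are the only supervision signal.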
