CV · AI · Jan 14, 2022

Learning from One and Only One Shot

arXiv:2201.08815v2 · 12 citations
AI Analysis

The paper addresses data efficiency in AI for applications such as visual recognition. Its approach of mimicking human-innate priors is novel, though the work is incremental in that it combines cognitive insights with existing methods.

The paper tackles the problem of machine learning requiring large datasets by proposing a cognitively-inspired similarity model that achieves human-level recognition with only 1-10 examples per class and no pretraining, outperforming modern neural networks and classical ML on benchmarks like MNIST, EMNIST, Omniglot, and QuickDraw.

Humans can generalize from only a few examples and from little pretraining on similar tasks. Yet, machine learning (ML) typically requires large data to learn or pre-learn to transfer. Motivated by nativism and artificial general intelligence, we directly model human-innate priors in abstract visual tasks such as character and doodle recognition. This yields a white-box model that learns general-appearance similarity by mimicking how humans naturally "distort" an object at first sight. Using just nearest-neighbor classification on this cognitively-inspired similarity space, we achieve human-level recognition with only 1-10 examples per class and no pretraining. This differs from few-shot learning that uses massive pretraining. In the tiny-data regime of MNIST, EMNIST, Omniglot, and QuickDraw benchmarks, we outperform both modern neural networks and classical ML. For unsupervised learning, by learning the non-Euclidean, general-appearance similarity space in a k-means style, we achieve multifarious visual realizations of abstract concepts by generating human-intuitive archetypes as cluster centroids.
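The classification step the abstract describes, nearest-neighbor lookup over a similarity space with one or a few labeled examples per class, can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: `pixel_distance` is a plain sum of absolute pixel differences standing in for the paper's learned, distortion-based similarity (which this page does not specify), and `nearest_neighbor_label` and the toy 3x3 images are hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Image = Sequence[Sequence[float]]


def pixel_distance(a: Image, b: Image) -> float:
    """Placeholder dissimilarity: sum of absolute pixel differences.

    ASSUMPTION: this is NOT the paper's cognitively-inspired similarity;
    it only marks where such a measure would plug in.
    """
    return sum(
        abs(x - y)
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    )


def nearest_neighbor_label(
    query: Image,
    support: List[Tuple[Image, str]],
    distance: Callable[[Image, Image], float] = pixel_distance,
) -> Optional[str]:
    """Return the label of the support example closest to the query.

    Returns None when the support set is empty.
    """
    best_label, best_dist = None, float("inf")
    for example, label in support:
        d = distance(query, example)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


# One labeled example per class (the "one-shot" setting), toy 3x3 images.
vertical_bar = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
horizontal_bar = [[0, 0, 0], [1, 1, 1], [0, 0, 0]]
support = [(vertical_bar, "vertical"), (horizontal_bar, "horizontal")]

# A vertical bar with its bottom pixel missing.
noisy_vertical = [[0, 1, 0], [0, 1, 0], [0, 0, 0]]
predicted = nearest_neighbor_label(noisy_vertical, support)
print(predicted)
```

Because the classifier only consults `distance`, swapping in a different similarity measure changes the behavior without touching the lookup logic, which is the sense in which the abstract's approach rests on the similarity space rather than on a trained classifier.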

