AI · LG · May 20, 2022

Diversity vs. Recognizability: Human-like generalization in one-shot generative models

arXiv:2205.10370v3 · 13 citations · h-index: 45
Originality: Incremental advance
AI Analysis

This work addresses the lack of appropriate metrics for comparing one-shot generative models with human generalization, a gap relevant to researchers in AI and cognitive science. The advance is incremental because it builds on existing models and datasets.

The authors evaluate one-shot generative models with a new framework built on two axes: sample recognizability and diversity. They find that GAN-like and VAE-like models sit at opposite ends of the diversity-recognizability space, and that model parameters such as spatial attention and disentanglement move a model through that space, which makes it possible to identify configurations that closely approximate human data.

Robust generalization to new concepts has long remained a distinctive feature of human intelligence. However, recent progress in deep generative models has now led to neural architectures capable of synthesizing novel instances of unknown visual concepts from a single training example. Yet, a more precise comparison between these models and humans is not possible because existing performance metrics for generative models (i.e., FID, IS, likelihood) are not appropriate for the one-shot generation scenario. Here, we propose a new framework to evaluate one-shot generative models along two axes: sample recognizability vs. diversity (i.e., intra-class variability). Using this framework, we perform a systematic evaluation of representative one-shot generative models on the Omniglot handwritten dataset. We first show that GAN-like and VAE-like models fall on opposite ends of the diversity-recognizability space. Extensive analyses of the effect of key model parameters further revealed that spatial attention and context integration have a linear contribution to the diversity-recognizability trade-off. In contrast, disentanglement transports the model along a parabolic curve that could be used to maximize recognizability. Using the diversity-recognizability framework, we were able to identify models and parameters that closely approximate human data.
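
To make the two axes concrete, the sketch below scores a set of generated samples on diversity and recognizability. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify how either quantity is computed, so here diversity is taken to be the average per-dimension standard deviation of feature vectors within one class, and recognizability is taken to be the accuracy of a nearest-prototype classifier. The function names, the use of pre-extracted feature vectors, and the toy Gaussian "models" are all hypothetical.

```python
import numpy as np


def diversity(features):
    """Intra-class variability (assumed definition): the standard deviation
    of each feature dimension across samples of one class, averaged."""
    features = np.asarray(features, dtype=float)
    return float(features.std(axis=0).mean())


def recognizability(sample_features, sample_labels, prototypes):
    """Assumed definition: fraction of samples whose nearest class
    prototype (Euclidean distance) matches their intended class."""
    sample_features = np.asarray(sample_features, dtype=float)
    prototypes = np.asarray(prototypes, dtype=float)
    # Squared distances between every sample and every prototype,
    # shape (n_samples, n_classes).
    diffs = sample_features[:, None, :] - prototypes[None, :, :]
    squared_dist = (diffs ** 2).sum(axis=-1)
    predicted = squared_dist.argmin(axis=1)
    return float((predicted == np.asarray(sample_labels)).mean())


def toy_model_scores(spread, n_per_class=50, seed=0):
    """Hypothetical stand-in for a one-shot generator: samples are Gaussian
    perturbations of each class prototype with the given spread."""
    rng = np.random.default_rng(seed)
    prototypes = 10.0 * np.eye(3, 8)  # 3 well-separated classes, 8-dim features
    labels = np.repeat(np.arange(3), n_per_class)
    samples = prototypes[labels] + spread * rng.standard_normal((labels.size, 8))
    # Diversity is computed per class, then averaged over classes.
    per_class = [diversity(samples[labels == c]) for c in range(3)]
    return float(np.mean(per_class)), recognizability(samples, labels, prototypes)


if __name__ == "__main__":
    for name, spread in [("tight generator", 0.1), ("loose generator", 5.0)]:
        div, rec = toy_model_scores(spread)
        print(f"{name}: diversity={div:.2f}, recognizability={rec:.2f}")
```

In this toy setting, a generator that stays close to the prototype scores low on diversity and high on recognizability, while a generator with a wider spread gains diversity and may lose recognizability. Plotting each model as a point in that two-dimensional space is the kind of comparison the framework describes.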

Code Implementations: 2 repos
