CV, AI, LG | Sep 10, 2024

Aligning Machine and Human Visual Representations across Abstraction Levels

DeepMind, Stanford
arXiv:2409.06509v4 | 4 citations | h-index: 29
Originality: Incremental advance
AI Analysis

This work addresses the problem of making AI vision systems more human-aligned and robust for applications in cognitive science and machine learning, though it is incremental as it builds on existing foundation models.

The paper tackles the misalignment between neural network and human visual representations across abstraction levels. It trains a teacher model to imitate human judgments, then transfers that structure to pretrained vision models through finetuning. The resulting models approximate human behavior and uncertainty more accurately across similarity tasks and show better generalization and out-of-distribution robustness on machine learning tasks.

Deep neural networks have achieved success across a wide range of applications, including as models of human behavior and neural representations in vision tasks. However, neural network training and human learning differ in fundamental ways, and neural networks often fail to generalize as robustly as humans do, raising questions regarding the similarity of their underlying representations. What is missing for modern learning systems to exhibit more human-aligned behavior? We highlight a key misalignment between vision models and humans: whereas human conceptual knowledge is hierarchically organized from fine- to coarse-scale distinctions, model representations do not accurately capture all these levels of abstraction. To address this misalignment, we first train a teacher model to imitate human judgments, then transfer human-aligned structure from its representations to refine the representations of pretrained state-of-the-art vision foundation models via finetuning. These human-aligned models more accurately approximate human behavior and uncertainty across a wide range of similarity tasks, including a new dataset of human judgments spanning multiple levels of semantic abstraction. They also perform better on a diverse set of machine learning tasks, increasing generalization and out-of-distribution robustness. Thus, infusing neural networks with additional human knowledge yields a best-of-both-worlds representation that is both more consistent with human cognitive judgments and more practically useful, paving the way toward more robust, interpretable, and human-aligned artificial intelligence systems.
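
The abstract does not state the training objective, so the following is only a minimal sketch of one way "transferring human-aligned structure" from a teacher to a student could be measured: compare the pairwise similarity matrices of the two embedding spaces. The function names, the cosine-similarity choice, the squared-error loss, and the random stand-in embeddings are all assumptions made for illustration, not the authors' method.

```python
import numpy as np

def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarities between the rows of `embeddings`."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)  # guard against zero rows
    return unit @ unit.T

def similarity_alignment_loss(student: np.ndarray, teacher: np.ndarray) -> float:
    """Mean squared gap between student and teacher similarity structure.

    Hypothetical objective: the two models may have different embedding
    widths, so they are compared only through their n x n similarity matrices.
    """
    s = cosine_similarity_matrix(student)
    t = cosine_similarity_matrix(teacher)
    # Skip the diagonal: self-similarity is 1 for every nonzero row.
    off_diagonal = ~np.eye(len(s), dtype=bool)
    return float(np.mean((s[off_diagonal] - t[off_diagonal]) ** 2))

rng = np.random.default_rng(0)
teacher = rng.normal(size=(8, 16))  # stand-in for teacher embeddings of 8 images
student = rng.normal(size=(8, 32))  # stand-in for pretrained-model embeddings

# An unrelated student has a clearly nonzero loss.
loss_random = similarity_alignment_loss(student, teacher)

# A rotated copy of the teacher has the same similarity structure,
# so its loss is zero up to floating-point error.
rotation, _ = np.linalg.qr(rng.normal(size=(16, 16)))
loss_rotated = similarity_alignment_loss(teacher @ rotation, teacher)
```

In a finetuning setup, a loss of this kind would be minimized with respect to the student's parameters. The rotation check shows why the comparison is made on similarity structure rather than on raw coordinates: two embedding spaces can agree on every pairwise similarity while differing in their axes.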
