AI, CL, CV, HC, LG | Jul 1, 2024

Human-like object concept representations emerge naturally in multimodal large language models

arXiv:2407.01067v3 | 38 citations | h-index: 21
AI Analysis

This work addresses the problem of understanding machine intelligence and developing more human-like AI systems, though it is incremental in building on existing LLM research.

The study investigated whether large language models (LLMs) and multimodal LLMs (MLLMs) develop human-like object concept representations. It found that 66-dimensional embeddings, derived from 4.7 million triplet judgments on 1,854 objects, showed human-like semantic clustering and aligned with human neural activity patterns.

Understanding how humans conceptualize and categorize natural objects offers critical insights into perception and cognition. With the advent of Large Language Models (LLMs), a key question arises: can these models develop human-like object representations from linguistic and multimodal data? In this study, we combined behavioral and neuroimaging analyses to explore the relationship between object concept representations in LLMs and human cognition. We collected 4.7 million triplet judgments from LLMs and Multimodal LLMs (MLLMs) to derive low-dimensional embeddings that capture the similarity structure of 1,854 natural objects. The resulting 66-dimensional embeddings were stable, predictive, and exhibited semantic clustering similar to human mental representations. Remarkably, the dimensions underlying these embeddings were interpretable, suggesting that LLMs and MLLMs develop human-like conceptual representations of objects. Further analysis showed strong alignment between model embeddings and neural activity patterns in brain regions such as EBA, PPA, RSC, and FFA. This provides compelling evidence that the object representations in LLMs, while not identical to human ones, share fundamental similarities that reflect key aspects of human conceptual knowledge. Our findings advance the understanding of machine intelligence and inform the development of more human-like artificial cognitive systems.
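To make the triplet-judgment idea concrete, here is a minimal sketch of how an embedding can predict which of three objects is the odd one out. The abstract does not specify the choice model, so the dot-product similarity and softmax rule below are assumptions, and the embedding values are random placeholders, not the paper's data. The function name odd_one_out_probs is hypothetical.

```python
import numpy as np


def odd_one_out_probs(x_i, x_j, x_k):
    """Return [P(i is odd), P(j is odd), P(k is odd)] for one triplet.

    Assumption (not stated in the abstract): similarity is the dot product
    of two embeddings, and the most similar pair is chosen via a softmax.
    """
    s_ij = x_i @ x_j
    s_ik = x_i @ x_k
    s_jk = x_j @ x_k
    # If pair (j, k) is the most similar, object i is the odd one out, etc.
    logits = np.array([s_jk, s_ik, s_ij], dtype=float)
    logits -= logits.max()  # subtract the max for numerical stability
    p = np.exp(logits)
    return p / p.sum()


# Placeholder embeddings: 1,854 objects x 66 dimensions, random and non-negative.
rng = np.random.default_rng(0)
embeddings = rng.random((1854, 66))

probs = odd_one_out_probs(embeddings[0], embeddings[1], embeddings[2])
```

Under these assumptions, fitting an embedding means adjusting the 66 values per object so that the predicted odd-one-out probabilities match the collected judgments as closely as possible.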
