LG CL

Aug 29, 2023

Large language models converge toward human-like concept organization

Mathias Lykke Gammelgaard, Jonathan Gabel Christiansen, Anders Søgaard

arXiv:2308.15047v13.83 citations

h-index: 46

Originality: Incremental advance

AI Analysis

This paper addresses the debate over whether LLMs reflect human-like semantics or merely memorize and match patterns, arguing that they seem to induce human-like knowledge from raw text.

The study found that large language models learn to organize concepts similarly to human-curated knowledge bases like WikiData, with bigger and better models showing more human-like organization across multiple model families and knowledge graph embeddings.

Large language models show human-like performance in knowledge extraction, reasoning and dialogue, but it remains controversial whether this performance is best explained by memorization and pattern matching, or whether it reflects human-like inferential semantics and world knowledge. Knowledge bases such as WikiData provide large-scale, high-quality representations of inferential semantics and world knowledge. We show that large language models learn to organize concepts in ways that are strikingly similar to how concepts are organized in such knowledge bases. Knowledge bases model collective, institutional knowledge, and large language models seem to induce such knowledge from raw text. We show that bigger and better models exhibit more human-like concept organization, across four families of language models and three knowledge graph embeddings.
