CL · AI · Jan 16, 2023

Dissociating language and thought in large language models

arXiv:2301.06627v3 · 473 citations · h-index: 62
Originality: Synthesis-oriented
AI Analysis

This work addresses the challenge of evaluating whether LLMs can use language in human-like ways, a question that matters to AI researchers and developers. Its contribution is incremental: it builds on existing neuroscience distinctions without introducing new methods.

The paper assesses the linguistic and cognitive capabilities of Large Language Models (LLMs) by distinguishing formal linguistic competence (knowledge of linguistic rules and patterns) from functional linguistic competence (understanding and using language in the world). It finds that LLMs excel at formal competence but perform inconsistently on functional tasks, which often require specialized fine-tuning or coupling with external modules.

Large Language Models (LLMs) have come closest among all models to date to mastering human language, yet opinions about their linguistic and cognitive capabilities remain split. Here, we evaluate LLMs using a distinction between formal linguistic competence -- knowledge of linguistic rules and patterns -- and functional linguistic competence -- understanding and using language in the world. We ground this distinction in human neuroscience, which has shown that formal and functional competence rely on different neural mechanisms. Although LLMs are surprisingly good at formal competence, their performance on functional competence tasks remains spotty and often requires specialized fine-tuning and/or coupling with external modules. We posit that models that use language in human-like ways would need to master both of these competence types, which, in turn, could require the emergence of mechanisms specialized for formal linguistic competence, distinct from functional competence.

