CLFeb 9

Large Language Models and Impossible Language Acquisition: "False Promise" or an Overturn of our Current Perspective towards AI

arXiv:2602.08437v10.6h-index: 14

Originality Incremental advance

AI Analysis

This addresses a fundamental problem in AI and linguistics regarding the capabilities of LLMs, offering incremental insights by testing specific models against theoretical critiques.

The paper tackles the challenge of whether Large Language Models (LLMs) can learn impossible languages, as critiqued by Chomsky, and finds that GPT-2 models underperform on impossible languages compared to possible ones (p<.001), while LSTM models align with Chomsky's arguments, highlighting the role of transformer architecture.

In Chomsky's provocative critique "The False Promise of CHATGPT," Large Language Models (LLMs) are characterized as mere pattern predictors that do not acquire languages via intrinsic causal and self-correction structures like humans, therefore are not able to distinguish impossible languages. It stands as a representative in a fundamental challenge to the intellectual foundations of AI, for it integrally synthesizes major issues in methodologies within LLMs and possesses an iconic a priori rationalist perspective. We examine this famous critic from both the perspective in pre-existing literature of linguistics and psychology as well as a research based on an experiment inquiring the capacity of learning both possible and impossible languages among LLMs. We constructed a set of syntactically impossible languages by applying certain transformations to English. These include reversing whole sentences, and adding negation based on word-count parity. Two rounds of controlled experiments were each conducted on GPT-2 small models and long short-term memory (LSTM) models. Statistical analysis (Welch's t-test) shows GPT2 small models underperform in learning all of the impossible languages compared to their performance on the possible language (p<.001). On the other hand, LSTM models' performance tallies with Chomsky's argument, suggesting the irreplaceable role of the evolution of transformer architecture. Based on theoretical analysis and empirical findings, we propose a new vision within Chomsky's theory towards LLMs, and a shift of theoretical paradigm outside Chomsky, from his "rationalist-romantics" paradigm to functionalism and empiricism in LLMs research.

View on arXiv PDF

Similar