Studies with impossible languages falsify LMs as models of human language
This work evaluates language models (LMs) as models of human language, a question of interest to researchers in computational linguistics and cognitive science, and highlights a key limitation of current LMs.
The paper argues that language models (LMs) often learn attested languages and many impossible languages equally well, which challenges claims that LMs share the human inductive biases that support language acquisition. The impossible languages that LMs do find difficult are simply more complex or random.
According to Futrell and Mahowald [arXiv:2501.17047], both infants and language models (LMs) find attested languages easier to learn than impossible languages that have unnatural structures. We review the literature and show that LMs often learn attested languages and many impossible languages equally well. The impossible languages that are difficult to learn are simply more complex (or random). LMs are missing the human inductive biases that support language acquisition.