CL · Feb 27

The Astonishing Ability of Large Language Models to Parse Jabberwockified Language

arXiv:2602.23928v1 · 1.11 citations

Originality: Incremental advance

AI Analysis

This research addresses how linguistic structure supports efficient language processing, with implications for both artificial and biological systems, though its exploration of LLM capabilities is an incremental advance.

The study demonstrates that large language models can accurately reconstruct meaning from English texts where content words are replaced with nonsense strings, achieving translations that closely match the original in many cases.

We show that large language models (LLMs) have an astonishing ability to recover meaning from severely degraded English texts. Texts in which content words have been randomly substituted by nonsense strings, e.g., "At the ghybe of the swuint, we are haiveed to Wourge Phrear-gwurr, who sproles into an ghitch flount with his crurp", can be translated to conventional English that is, in many cases, close to the original text, e.g., "At the start of the story, we meet a man, Chow, who moves into an apartment building with his wife." These results show that structural cues (e.g., morphosyntax, closed-class words) constrain lexical meaning to a much larger degree than imagined. Although the abilities of LLMs to make sense of "Jabberwockified" English are clearly superhuman, they are highly relevant to understanding linguistic structure and suggest that efficient language processing either in biological or artificial systems likely benefits from very tight integration between syntax, lexical semantics, and general world knowledge.
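The substitution described in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure, which the abstract does not specify. It assumes a small hand-written closed-class word list, keeps a few inflectional suffixes as structural cues, and builds nonsense stems from arbitrary letter clusters. All names here (`jabberwockify`, `CLOSED_CLASS`, `SUFFIXES`, and the cluster lists) are hypothetical.

```python
import random
import re

# Hypothetical, deliberately small closed-class list (an assumption for this sketch).
CLOSED_CLASS = {
    "a", "an", "the", "of", "to", "in", "into", "at", "with", "who", "we",
    "are", "is", "his", "her", "and", "or", "by", "on", "for", "that", "it",
}
# Inflectional endings kept intact so morphosyntactic cues survive (assumption).
SUFFIXES = ("ing", "ed", "s")
# Arbitrary letter clusters used to build pronounceable nonsense stems.
ONSETS = ["gh", "sw", "spr", "fl", "cr", "wh", "bl", "thr", "gw", "pl"]
VOWELS = ["a", "ou", "ui", "y", "o", "ai", "u"]
CODAS = ["be", "nt", "rge", "tch", "rp", "le", "nk", "sh"]


def nonsense_stem(rng):
    """Build one nonsense stem as onset + vowel + coda."""
    return rng.choice(ONSETS) + rng.choice(VOWELS) + rng.choice(CODAS)


def jabberwockify(text, seed=0):
    """Replace every word outside CLOSED_CLASS with a nonsense string."""
    rng = random.Random(seed)
    mapping = {}  # original stem -> nonsense stem, so repeated words stay consistent

    def replace(match):
        word = match.group(0)
        lower = word.lower()
        if lower in CLOSED_CLASS:
            return word  # closed-class words are left untouched
        # Keep a recognised suffix only if a stem of at least 3 letters remains.
        suffix = next(
            (s for s in SUFFIXES if lower.endswith(s) and len(lower) > len(s) + 2),
            "",
        )
        stem = lower[: len(lower) - len(suffix)]
        if stem not in mapping:
            mapping[stem] = nonsense_stem(rng)
        out = mapping[stem] + suffix
        # Preserve initial capitalisation (e.g. proper nouns, sentence starts).
        return out.capitalize() if word[0].isupper() else out

    # Only alphabetic runs are rewritten; punctuation and spacing pass through.
    return re.sub(r"[A-Za-z]+", replace, text)


if __name__ == "__main__":
    print(jabberwockify("At the start of the story, we meet a man who moves "
                        "into an apartment with his wife.", seed=1))
```

In this sketch the output keeps the function words, word order, punctuation, and endings such as "-s", while every content stem is replaced. That is the kind of residual structure the abstract argues constrains lexical meaning.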

