Talking with Oompa Loompas: A novel framework for evaluating linguistic acquisition of LLM agents
This work addresses the need for better evaluation benchmarks for language acquisition in LLM agents, though the contribution is incremental in that it builds on existing studies of linguistic competence.
The paper evaluates whether LLM agents can acquire a language through interactive feedback, a question that prior work has not addressed. It finds that the agents fail to establish a conversation within 100 responses but show human-like learning strategies.
Existing evaluation studies on the linguistic competence of large language model (LLM) agents have focused primarily on vocabulary learning, morphological rule induction, syntactic generalization, pragmatic inference, and cross-linguistic transfer. However, none assess whether LLM agents can acquire a language through pattern recognition and interactive feedback, a central feature of human language acquisition. We propose a novel experimental framework in which an LLM agent is evaluated on its ability to acquire and use a newly constructed language (Tinkatongue) in conversation with a bot that understands only Tinkatongue. Our findings show that LLM agents fail to establish a conversation within 100 responses, yet they adopt distinct strategies that mirror human approaches to language learning. The results suggest a new direction for evaluation benchmarks and open pathways to model designs that learn more effectively from interactive feedback.
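The interaction protocol described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy vocabulary, the bot's acceptance rule, the feedback tokens, and the two stand-in agents are hypothetical and are not the actual Tinkatongue grammar, bot, or LLM agents used in the work. Only the budget of 100 responses comes from the text.

```python
from typing import Callable, List, Tuple

# Hypothetical toy vocabulary; NOT the actual Tinkatongue lexicon.
TOY_VOCAB = {"lorpa", "minto", "sefu", "kado"}
MAX_RESPONSES = 100  # response budget stated in the abstract

History = List[Tuple[str, str]]  # (agent utterance, bot feedback) pairs


def bot_reply(utterance: str) -> Tuple[bool, str]:
    """Toy bot: accept an utterance only if every token is in the toy vocabulary."""
    tokens = utterance.lower().split()
    valid = bool(tokens) and all(t in TOY_VOCAB for t in tokens)
    # Assumed feedback tokens: "kado" signals acceptance, "sefu" rejection.
    return valid, "kado" if valid else "sefu"


def run_episode(agent: Callable[[History], str],
                max_responses: int = MAX_RESPONSES) -> int:
    """Run one episode and return how many agent utterances the bot accepted."""
    history: History = []
    valid_count = 0
    for _ in range(max_responses):
        utterance = agent(history)
        valid, feedback = bot_reply(utterance)
        history.append((utterance, feedback))
        valid_count += int(valid)
    return valid_count


def english_agent(history: History) -> str:
    """Stand-in agent that never adapts and keeps speaking English."""
    return "hello there"


def imitating_agent(history: History) -> str:
    """Stand-in agent that imitates the bot's most recent feedback token."""
    return history[-1][1] if history else "hello"


if __name__ == "__main__":
    print("english_agent accepted:", run_episode(english_agent))
    print("imitating_agent accepted:", run_episode(imitating_agent))
```

In this toy setting the imitating agent is accepted on every turn after its first, which only shows how feedback could be exploited under these simplified rules. It says nothing about how real LLM agents behave against the actual Tinkatongue bot, where the abstract reports that no conversation is established within the budget.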