CL, May 16

Language Acquisition Device in Large Language Models

Masato Mita, Taiga Someya, Ryo Yoshida, Yohei Oseki

arXiv:2605.16758

AI Analysis

For researchers in NLP and cognitive science, this work proposes a pre-pretraining strategy that matches strong formal-language baselines in token efficiency while giving LLMs a human-like resistance to structurally implausible languages.

LLMs are less data-efficient than humans. Pre-pretraining on MP-STRUCT, a formal language inspired by the Language Acquisition Device hypothesis, matches strong formal-language baselines in token efficiency after only 500 steps and imparts resistance to structurally implausible languages. A simplified variant, MP-STRUCT CORE, outperforms k-Shuffle Dyck despite not being definable in C-RASP.

Large Language Models (LLMs) remain substantially less data-efficient than humans. Pre-pretraining (PPT) on synthetic languages has been proposed to close this gap, with prior work emphasizing highly expressive formal languages such as $k$-Shuffle Dyck. Inspired by the Language Acquisition Device (LAD) hypothesis, which posits that innate constraints preemptively restrict the learner's hypothesis space to natural-language-like structure, we propose LAD-inspired PPT: pre-pretraining on MP-STRUCT, a formal language whose strings encode hierarchical composition, feature-based dependencies, and long-distance displacement via MERGE, AGREE, and MOVE. A brief 500-step PPT with MP-STRUCT matches strong formal-language baselines in token efficiency while additionally imparting a human-like resistance to structurally implausible languages (e.g., REVERSE). Analyzing simplified variants, we find that MP-STRUCT CORE outperforms $k$-Shuffle Dyck despite not being definable in C-RASP (a formal bound on transformer expressivity), challenging the prior hypothesis that effective PPT languages must be both hierarchically expressive and circuit-theoretically learnable. We show that functional landmarks, which reduce dependency resolution ambiguity, are a key driver, suggesting that effective PPT design depends not only on expressivity but also on the accessibility of dependency resolution.
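For readers unfamiliar with the $k$-Shuffle Dyck baseline named in the abstract, the following is a minimal sketch of a membership check, assuming the standard definition of the language as the shuffle of $k$ single-bracket Dyck languages: each bracket type must be balanced on its own, but different types may interleave freely. The function name, the `PAIRS` alphabet, and the choice of $k = 2$ are illustrative assumptions and are not taken from the paper. MP-STRUCT itself is not sketched here because the abstract does not specify its grammar.

```python
def is_k_shuffle_dyck(tokens, pairs):
    """Check membership in the shuffle of Dyck-1 languages, one per bracket pair.

    `pairs` is a list of (open, close) symbols; k = len(pairs).
    Assumed definition: every bracket type is balanced independently,
    so one counter per type is enough (no stack is needed).
    """
    opener_of = {close: open_ for open_, close in pairs}
    depth = {open_: 0 for open_, _ in pairs}

    for tok in tokens:
        if tok in depth:                # opening bracket of some type
            depth[tok] += 1
        elif tok in opener_of:          # closing bracket of some type
            open_ = opener_of[tok]
            depth[open_] -= 1
            if depth[open_] < 0:        # closed before it was opened
                return False
        else:                           # symbol outside the alphabet
            return False

    # Every bracket type must be fully closed at the end.
    return all(d == 0 for d in depth.values())


# Hypothetical alphabet with k = 2 bracket types.
PAIRS = [("(", ")"), ("[", "]")]

if __name__ == "__main__":
    for s in ["([)]", "([])", "(]", "(()"]:
        print(repr(s), is_k_shuffle_dyck(s, PAIRS))
```

Note that a crossing string such as `([)]` is accepted here, whereas ordinary (nested) Dyck would reject it. This freedom to interleave is what distinguishes the shuffle variant.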
