CL, Apr 29

What Kind of Language is Easy to Language-Model Under Curriculum Learning?

Nadine El-Naggar, Tatsuki Kuribayashi, Ted Briscoe

arXiv:2604.26844

AI Analysis

For researchers studying language typology and the learning biases of language models, this work shows that the learning scenario (curriculum vs. random ordering) interacts with inductive bias. The contribution is incremental, however: it extends existing work with a simple curriculum learning (CL) variant.

This study investigates whether curriculum learning (starting with simpler sentences) affects the inductive bias of language models in reproducing typological patterns of word order. The authors find that curriculum learning substantially impacts the apparent inductive bias of LMs.

Many of the thousands of attested languages share common configurations of features, creating a spectrum from typologically very rare (e.g., object-verb-subject word order) or impossible languages to very common combinations of features (e.g., subject-object-verb word order). One central question is under what conditions such typological tendencies can be predicted, and specifically whether the learning bias of language models (LMs) is sufficient to reproduce such patterns. In this study, we add one dimension to this analysis, the learning scenario for LMs, to explore its interaction with the inductive bias of LMs. Specifically, as a first study, we examine the effect of curriculum learning (CL), a developmentally motivated learning scenario in which training starts with simpler sentences rather than randomly ordered input. We expand existing LM-based exploration (El-Naggar et al., 2025a,b) with a simple CL variant and find that CL substantially impacts the apparent inductive bias of LMs.
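
As a rough illustration of the contrast the abstract draws between curriculum-ordered and randomly ordered input, the sketch below sorts a toy corpus from simpler to harder sentences. The abstract does not say how simplicity is measured, so word count is used here purely as an assumed stand-in; the function names and the toy corpus are hypothetical and are not taken from the paper.

```python
import random


def sentence_difficulty(sentence: str) -> int:
    """Assumed difficulty proxy: number of whitespace-separated tokens.

    The paper may define "simpler" differently; this is only a placeholder.
    """
    return len(sentence.split())


def curriculum_order(sentences):
    """Return the sentences ordered from shortest to longest (ties keep input order)."""
    return sorted(sentences, key=sentence_difficulty)


def random_order(sentences, seed=0):
    """Return a shuffled copy of the sentences, as a baseline without a curriculum."""
    shuffled = list(sentences)
    random.Random(seed).shuffle(shuffled)
    return shuffled


# Hypothetical toy corpus, for illustration only.
corpus = [
    "the dog sleeps",
    "the cat that the dog chased sleeps",
    "dogs bark",
    "the dog chased the cat",
]

curriculum = curriculum_order(corpus)
baseline = random_order(corpus, seed=0)
```

Under this assumption, a model trained on `curriculum` would see the two-word sentence first and the seven-word sentence last, while `baseline` presents the same sentences in an arbitrary order.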
