CL · AI · Oct 30, 2024

Less is More: Pre-Training Cross-Lingual Small-Scale Language Models with Cognitively-Plausible Curriculum Learning Strategies

Suchir Salhan, Richard Diehl Martinez, Zébulon Goriely, Paula Buttery

arXiv:2410.22886v29.614 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of making SSLMs more cognitively plausible for applications in language acquisition research, though it appears incremental as it builds on existing curriculum learning approaches.

The study applies fine-grained, acquisition-inspired curriculum learning strategies, grounded in linguistic theories, to Small-Scale Language Models (SSLMs), and finds that these curricula can outperform non-curriculum baselines in cross-lingual settings.

Curriculum Learning has been a popular strategy to improve the cognitive plausibility of Small-Scale Language Models (SSLMs) in the BabyLM Challenge. However, it has not led to considerable improvements over non-curriculum models. We assess whether linguistic acquisition theories can be used to specify more fine-grained curriculum learning strategies. To do so, we create age-ordered corpora of Child-Directed Speech for four typologically distant language families to implement SSLMs and acquisition-inspired curricula cross-lingually. Comparing the success of three objective curricula (Growing, Inwards and MMM) that precisely replicate the predictions of acquisition theories on a standard SSLM architecture, we find that fine-grained acquisition-inspired curricula can outperform non-curriculum baselines. The performance benefits of curriculum strategies in SSLMs can be derived by specifying fine-grained, language-specific curricula that precisely replicate language acquisition theories.
