CL · Mar 17, 2025

Modelling Child Learning and Parsing of Long-range Syntactic Dependencies

Louis Mahon, Mark Johnson, Mark Steedman

arXiv:2503.12832v1 · 1 citation · h-index: 2 · Cognition

Originality: Incremental advance

AI Analysis

This work addresses the problem of how children acquire complex language structures; it is an incremental advance because it builds on existing probabilistic models.

The authors developed a probabilistic child language acquisition model that learns word meanings and syntax from real child-directed speech, successfully deducing parse trees and meanings, including for long-range syntactic dependencies like object wh-questions.

This work develops a probabilistic child language acquisition model to learn a range of linguistic phenomena, most notably long-range syntactic dependencies of the sort found in object wh-questions, among other constructions. The model is trained on a corpus of real child-directed speech, where each utterance is paired with a logical form as a meaning representation. It then learns both word meanings and language-specific syntax simultaneously. After training, the model can deduce the correct parse tree and word meanings for a given utterance-meaning pair, and can infer the meaning if given only the utterance. The successful modelling of long-range dependencies is theoretically important because it exploits aspects of the model that are, in general, trans-context-free.
