Towards Using Diachronic Distributed Word Representations as Models of Lexical Development
This work addresses the challenge of understanding how children acquire language. Its contribution is incremental: it extends existing distributed representation methods with a temporal focus.
The paper models lexical development in children using diachronic distributed word representations learned from temporally sliced corpora, and demonstrates their effectiveness on tasks such as lexical categorization and the analysis of word acquisition rates.
Recent work has shown that distributed word representations can encode abstract information from child-directed speech. In this paper, we use diachronic distributed word representations to perform temporal modeling and analysis of lexical development in children. Unlike all previous work, we use a temporally sliced corpus to learn distributed word representations of child speech and child-directed speech under a curriculum-learning setting. In our experiments, we perform a lexical categorization task to trace the trajectories of semantic and syntactic knowledge acquisition in children. Next, we perform linear mixed-effects modeling over the diachronic representational changes to study the role of input word frequencies in the rate of word acquisition in children. We also perform a fine-grained analysis of lexical knowledge transfer from adults to children using Representational Similarity Analysis. Finally, we perform a qualitative analysis of the diachronic representations from our model, which reveals the grounding and word associations in the mental lexicon of children. Our experiments demonstrate the ease of use and effectiveness of diachronic distributed word representations in modeling lexical development.
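The abstract does not say how Representational Similarity Analysis is computed, so the following is only a generic sketch of the technique, not the authors' implementation. It assumes two embedding spaces over a shared vocabulary, builds a cosine-distance dissimilarity matrix for each, and correlates the ranked pairwise distances (a Spearman-style score with ties broken arbitrarily). The function names and the toy vectors are hypothetical and chosen purely for illustration.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def pairwise_distances(space, words):
    """Flatten the upper triangle of the dissimilarity matrix over `words`."""
    return [
        cosine_distance(space[words[i]], space[words[j]])
        for i in range(len(words))
        for j in range(i + 1, len(words))
    ]


def ordinal_ranks(values):
    """Rank values from 0 upward; ties are broken by position (an assumption)."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    for rank, index in enumerate(order):
        ranks[index] = float(rank)
    return ranks


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rsa_score(space_a, space_b):
    """Rank-correlate the pairwise distances of two spaces over shared words."""
    words = sorted(set(space_a) & set(space_b))
    ranks_a = ordinal_ranks(pairwise_distances(space_a, words))
    ranks_b = ordinal_ranks(pairwise_distances(space_b, words))
    return pearson(ranks_a, ranks_b)


# Toy vectors, invented for illustration only (not data from the paper).
child_space = {
    "dog": [1.0, 0.2, 0.0],
    "cat": [0.9, 0.3, 0.1],
    "eat": [0.1, 1.0, 0.3],
    "run": [0.0, 0.8, 0.5],
}
# Uniform scaling leaves cosine distances unchanged, so the two spaces
# have the same representational geometry.
adult_space = {word: [2.0 * x for x in vec] for word, vec in child_space.items()}

score = rsa_score(child_space, adult_space)
```

Because the score depends only on pairwise distances within each space, the two spaces do not need to share dimensions or be aligned, which is one reason this style of comparison suits embeddings trained separately on different corpora or time slices.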