CLMar 24, 2025

Unsupervised Acquisition of Discrete Grammatical Categories

arXiv:2503.18702v11 citationsh-index: 5
Originality Incremental advance
AI Analysis

This addresses the challenge of how abstract grammatical knowledge can be learned without supervision, which is incremental as it builds on existing computational linguistics methods.

The paper tackles the problem of unsupervised acquisition of discrete grammatical categories by using a multi-agent system where a daughter agent learns from language exemplars generated by a mother agent, demonstrating that statistical analyses yield grammatical rules and non-trivial categories are acquired, validated with a test set.

This article presents experiments performed using a computational laboratory environment for language acquisition experiments. It implements a multi-agent system consisting of two agents: an adult language model and a daughter language model that aims to learn the mother language. Crucially, the daughter agent does not have access to the internal knowledge of the mother language model but only to the language exemplars the mother agent generates. These experiments illustrate how this system can be used to acquire abstract grammatical knowledge. We demonstrate how statistical analyses of patterns in the input data corresponding to grammatical categories yield discrete grammatical rules. These rules are subsequently added to the grammatical knowledge of the daughter language model. To this end, hierarchical agglomerative cluster analysis was applied to the utterances consecutively generated by the mother language model. It is argued that this procedure can be used to acquire structures resembling grammatical categories proposed by linguists for natural languages. Thus, it is established that non-trivial grammatical knowledge has been acquired. Moreover, the parameter configuration of this computational laboratory environment determined using training data generated by the mother language model is validated in a second experiment with a test set similarly resulting in the acquisition of non-trivial categories.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes