Production vs Perception: The Role of Individuality in Usage-Based Grammar Induction
This paper addresses a theoretical issue in computational linguistics and is aimed at researchers. Its contribution is incremental, since it builds on existing grammar induction methods.
The paper asks whether production-based and perception-based grammar induction differ in grammar growth and in the similarity of the grammars learned. It finds that production-based grammars differ significantly from perception-based grammars, with a steeper growth curve attributable to inter-individual differences.
This paper asks whether a distinction between production-based and perception-based grammar induction influences either (i) the growth curve of grammars and lexicons or (ii) the similarity between representations learned from independent sub-sets of a corpus. A production-based model is trained on the usage of a single individual, thus simulating the grammatical knowledge of a single speaker. A perception-based model is trained on an aggregation of many individuals, thus simulating grammatical generalizations learned from exposure to many different speakers. To ensure robustness, the experiments are replicated across two registers of written English, with four additional registers reserved as a control. A set of three computational experiments shows that production-based grammars are significantly different from perception-based grammars across all conditions, with a steeper growth curve that can be explained by substantial inter-individual grammatical differences.
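As a rough illustration of the two training conditions and of what a growth curve measures, the following Python sketch builds a production-style sample (one individual's usage) and a perception-style sample (usage aggregated across individuals), then counts distinct items as more data is observed. It is not the paper's method: the function names, the round-robin aggregation, the use of plain word tokens as stand-ins for constructions or lexical items, and the toy data are all assumptions made for illustration, and the toy data is not meant to reproduce the reported results.

```python
from itertools import zip_longest


def growth_curve(items, step):
    """Return (items_seen, distinct_items) pairs, sampled every `step` items."""
    seen = set()
    curve = []
    for i, item in enumerate(items, start=1):
        seen.add(item)
        if i % step == 0:
            curve.append((i, len(seen)))
    return curve


def production_sample(corpus_by_author, author):
    """Production condition (assumed): the usage of a single individual."""
    return list(corpus_by_author[author])


def perception_sample(corpus_by_author):
    """Perception condition (assumed): round-robin mix of all individuals."""
    streams = [iter(items) for items in corpus_by_author.values()]
    mixed = []
    for group in zip_longest(*streams):
        # zip_longest pads exhausted streams with None; drop the padding.
        mixed.extend(item for item in group if item is not None)
    return mixed


# Hypothetical toy data: word tokens stand in for constructions.
toy_corpus = {
    "author_a": "the cat sat on the mat".split(),
    "author_b": "a dog ran in a park".split(),
}

production_curve = growth_curve(production_sample(toy_corpus, "author_a"), step=2)
perception_curve = growth_curve(perception_sample(toy_corpus), step=4)

print("production:", production_curve)
print("perception:", perception_curve)
```

In an actual study, the items would be the constructions or lexical entries produced by a grammar induction model rather than raw tokens, and the curves would be compared across many individuals and many aggregated samples of equal size.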