CL · Aug 30, 2024

Simple stochastic processes behind Menzerath's Law

arXiv:2409.00279v1 · 1 citation · h-index: 2
Originality: Incremental advance
AI Analysis

This work addresses a specific problem in linguistics by refining models for Menzerath's Law, representing an incremental improvement over existing approaches.

The paper models Menzerath's Law, which relates the length of a linguistic construct to the length of its constituents. It shows that simple stochastic processes, namely a bivariate log-normal distribution and a Gaussian copula model, can reflect real-world data, with the Gaussian copula being the more accurate of the two.

This paper revisits Menzerath's Law, also known as the Menzerath-Altmann Law, which models a relationship between the length of a linguistic construct and the average length of its constituents. Recent findings indicate that simple stochastic processes can display Menzerathian behaviour, though existing models fail to accurately reflect real-world data. If we adopt the basic principle that a word can change its length in both syllables and phonemes, where the correlation between these variables is not perfect and these changes are of a multiplicative nature, we obtain a bivariate log-normal distribution. The present paper shows that this very simple principle yields the classic Altmann model of the Menzerath-Altmann Law. If we model the joint distribution separately and independently from the marginal distributions, we can obtain an even more accurate model by using a Gaussian copula. The models are confronted with empirical data, and alternative approaches are discussed.
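
The log-normal argument can be illustrated with a small simulation. This is a minimal sketch and not the paper's code: the function names, the parameter values, and the use of continuous rather than integer lengths are all assumptions made for illustration. If (ln X, ln Z) is bivariate normal, where X is word length in syllables and Z is word length in phonemes, then the mean of ln(Z/X) given X = x is linear in ln x with slope rho * sigma_z / sigma_x - 1. Syllable length Y = Z/X therefore follows a power law y = a * x^b, which is assumed here to be the form of the Altmann model that the abstract refers to.

```python
import numpy as np


def simulate_log_lengths(n, mu, sigma, rho, seed=0):
    """Draw n pairs (ln X, ln Z) from a bivariate normal distribution.

    X stands for word length in syllables and Z for word length in phonemes.
    Lengths are kept continuous for simplicity (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    cov = [
        [sigma[0] ** 2, rho * sigma[0] * sigma[1]],
        [rho * sigma[0] * sigma[1], sigma[1] ** 2],
    ]
    return rng.multivariate_normal(mu, cov, size=n)


def fitted_exponent(log_x, log_z):
    """Least-squares fit of ln Y = ln a + b * ln X, where Y = Z / X."""
    log_y = log_z - log_x  # log of the mean syllable length in phonemes
    slope, intercept = np.polyfit(log_x, log_y, 1)
    return slope, intercept


def theoretical_exponent(sigma, rho):
    """Exponent b implied by the bivariate normal conditional mean."""
    return rho * sigma[1] / sigma[0] - 1.0


# Hypothetical parameter values, not estimates from any corpus.
mu = (0.8, 1.8)       # means of ln X and ln Z
sigma = (0.5, 0.45)   # standard deviations of ln X and ln Z
rho = 0.9             # imperfect correlation between the two log-lengths

logs = simulate_log_lengths(200_000, mu, sigma, rho, seed=1)
b_hat, log_a_hat = fitted_exponent(logs[:, 0], logs[:, 1])
print("fitted exponent b:     ", b_hat)
print("theoretical exponent b:", theoretical_exponent(sigma, rho))
```

With these illustrative values the theoretical exponent is 0.9 * 0.45 / 0.5 - 1 = -0.19. The exponent is negative, so longer words have shorter syllables on average, which is the Menzerathian pattern. It stays negative whenever rho * sigma_z < sigma_x.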

