CL AO, Oct 5, 2015

Stochastic model for phonemes uncovers an author-dependency of their usage

arXiv:1510.01315v2, 4 citations
Originality: Incremental advance
AI Analysis

This work addresses linguistic patterns in phoneme usage and is aimed at researchers in computational linguistics and text analysis. It appears incremental, since it builds on existing statistical models.

The authors model phoneme usage in texts and find that rank-frequency relations for phonemes are described by a Dirichlet distribution. This description reveals an author-dependent effect that is not explained by vocabulary. By contrast, rank-frequency relations for words follow Zipf's law and are author-independent.

We study rank-frequency relations for phonemes, the minimal units that still relate to linguistic meaning. We show that these relations can be described by the Dirichlet distribution, a direct analogue of the ideal-gas model in statistical mechanics. This description allows us to demonstrate that the rank-frequency relations for phonemes of a text do depend on its author. The author-dependency effect is not caused by the author's vocabulary (common words used in different texts), and is confirmed by several alternative means. This suggests that it can be directly related to phonemes. These features contrast with rank-frequency relations for words, which are both author- and text-independent and are governed by Zipf's law.
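As a rough illustration of the idea in the abstract, the sketch below computes a rank-frequency relation from a symbol sequence and compares it with the mean ordered frequencies of a Dirichlet model. It is not the authors' fitting procedure: the symmetric Dirichlet with a single parameter `beta`, the Monte Carlo averaging, the function names, and the use of letters as a stand-in for phonemes are all assumptions made here for illustration.

```python
import random
from collections import Counter


def rank_frequency(symbols):
    """Return normalized frequencies sorted in decreasing order (rank 1 first)."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return sorted((c / total for c in counts.values()), reverse=True)


def sample_dirichlet(n, beta, rng):
    """Draw one point from a symmetric Dirichlet(beta, ..., beta) on n categories
    by normalizing independent Gamma(beta, 1) variates."""
    draws = [rng.gammavariate(beta, 1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


def mean_ordered_dirichlet(n, beta, samples=2000, seed=0):
    """Monte Carlo estimate of the mean ordered frequencies under the model."""
    rng = random.Random(seed)
    acc = [0.0] * n
    for _ in range(samples):
        ordered = sorted(sample_dirichlet(n, beta, rng), reverse=True)
        for rank, freq in enumerate(ordered):
            acc[rank] += freq
    return [a / samples for a in acc]


# Toy input: letters stand in for phonemes (an assumption, not real phoneme data).
text = "the quick brown fox jumps over the lazy dog"
observed = rank_frequency(text.replace(" ", ""))

# beta=1.0 is an arbitrary illustrative value, not a fitted parameter.
model = mean_ordered_dirichlet(len(observed), beta=1.0)

for rank, (obs, mod) in enumerate(zip(observed, model), start=1):
    print(f"rank {rank:2d}  observed {obs:.4f}  model {mod:.4f}")
```

In a setting like the one the abstract describes, one would presumably choose `beta` to match the observed curve for each text and then compare the resulting values across authors; how the paper actually does this is not stated in this summary.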

