DATA-AN STAT-MECH CL

Feb 18, 2013

Explaining Zipf's Law via Mental Lexicon

Armen E. Allahverdyan, Weibing Deng, Q. A. Wang

arXiv:1302.4383v1

20 citations

Originality: Synthesis-oriented

AI Analysis

This paper offers a theoretical explanation for a foundational statistical law in linguistics, with potential relevance to scaling laws in the natural sciences, but it appears incremental because it builds on existing probabilistic models.

The authors explain Zipf's law by deriving it from the assumption that words enter a text with random probabilities. They link the prior density of those probabilities to the author's mental lexicon and show that the derivation covers single texts as well as the generalizations of the law to high and low frequencies.

Zipf's law is the major regularity of statistical linguistics and has served as a prototype for rank-frequency relations and scaling laws in the natural sciences. Here we show that Zipf's law, together with its applicability to a single text and its generalizations to high and low frequencies (including hapax legomena), can be derived by assuming that words are drawn into the text with random probabilities. Their a priori density relates, via Bayesian statistics, to general features of the mental lexicon of the author who produced the text.
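The idea of drawing words with random probabilities can be illustrated with a small simulation. The sketch below is not the authors' model: it assumes, purely for illustration, that unnormalized word weights follow a Pareto density with tail index 1, a choice for which the r-th largest of n weights scales roughly as n/r. The function names, vocabulary size, text length, and rank window are all hypothetical.

```python
import math
import random
from collections import Counter


def sample_word_probabilities(n_words, rng):
    """Draw random word probabilities from an assumed heavy-tailed prior.

    Assumption (not taken from the paper): weights are Pareto distributed
    with tail index 1, sampled by inverting the CDF F(w) = 1 - 1/w, w >= 1.
    """
    weights = [1.0 / (1.0 - rng.random()) for _ in range(n_words)]
    total = sum(weights)
    return [w / total for w in weights]


def generate_text(probabilities, n_tokens, rng):
    """Draw word indices independently according to the given probabilities."""
    return rng.choices(range(len(probabilities)), weights=probabilities, k=n_tokens)


def rank_frequency(tokens):
    """Return observed relative frequencies sorted from most to least frequent."""
    counts = sorted(Counter(tokens).values(), reverse=True)
    n_tokens = len(tokens)
    return [c / n_tokens for c in counts]


def log_log_slope(frequencies, r_min=5, r_max=200):
    """Least-squares slope of log(frequency) against log(rank) on a rank window."""
    r_max = min(r_max, len(frequencies))
    xs = [math.log(r) for r in range(r_min, r_max + 1)]
    ys = [math.log(frequencies[r - 1]) for r in range(r_min, r_max + 1)]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


if __name__ == "__main__":
    rng = random.Random(0)
    probs = sample_word_probabilities(2000, rng)
    tokens = generate_text(probs, 100_000, rng)
    freqs = rank_frequency(tokens)
    print("distinct words observed:", len(freqs))
    print("log-log slope over ranks 5..200:", log_log_slope(freqs))
```

Under this assumed prior, a slope near -1 on the chosen rank window would be consistent with Zipf-like behavior, while words that occur only once in the simulated text play the role of hapax legomena. Other priors generally give other exponents, so the outcome depends on the illustrative density used here.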
