SOC-PH | CL | AP | Aug 31, 2024

Statistics of punctuation in experimental literature -- the remarkable case of "Finnegans Wake" by James Joyce

arXiv:2409.00483v1 | 8 citations | h-index: 32
Originality: Synthesis-oriented
AI Analysis

This research identifies unique statistical properties in experimental texts, particularly Joyce's 'Finnegans Wake', which may interest linguists and computational text analysts, though it is incremental in extending prior work on punctuation universality.

The study analyzes punctuation patterns in experimental literature. Distances between punctuation marks generally follow a discrete Weibull distribution, but some of James Joyce's works, especially 'Finnegans Wake', exhibit thicker distribution tails and decreasing hazard functions. Sentence lengths are not constrained by that distribution and in some texts show multifractality.

As recent studies indicate, the structure that punctuation imposes on written texts develops patterns with certain universal characteristics. In particular, a large collection of classic literary works has shown that the distances between consecutive punctuation marks, measured in number of words, obey the discrete Weibull distribution - a discrete variant of a distribution often used in survival analysis. The present work extends the analysis of punctuation usage patterns to more experimental pieces of world literature. It turns out that the distances between punctuation marks typically comply with the discrete Weibull distribution here as well. However, some of the works by James Joyce are distinct in this regard: the tails of the relevant distributions are significantly thicker and, consequently, the corresponding hazard functions are decreasing, a behavior not observed in typical literary texts in prose. "Finnegans Wake" - the same one to which science owes the word "quarks" for the most fundamental constituents of matter - is particularly striking in this context. At the same time, in all the studied texts, the sentence lengths - representing the distances between sentence-ending punctuation marks - reveal more freedom and are not constrained by the discrete Weibull distribution. In some cases this freedom translates into long-range nonlinear correlations, which manifest themselves in multifractality. Again, the text most spectacular in terms of multifractality is "Finnegans Wake".
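To make the quantities in the abstract concrete, the following is a minimal sketch, not the authors' code. It assumes one common parametrization of the discrete Weibull distribution, with survival function P(X > k) = (1 - p)^(k^beta) for k = 0, 1, 2, ...; the abstract does not state which form the paper uses. The set of punctuation marks, the tokenizing regex, and all function names are likewise illustrative assumptions. Under this parametrization, beta = 1 gives a constant hazard (the geometric distribution), beta > 1 an increasing hazard, and beta < 1 a decreasing hazard with a thicker tail, which is the regime the abstract associates with some of Joyce's works.

```python
import re

# Assumed set of punctuation marks; the paper may use a different set.
MARKS = ".,;:!?"

def punctuation_distances(text, marks=MARKS):
    """Return the number of words between consecutive punctuation marks."""
    mark_set = set(marks)
    # Tokens are either words (optionally with inner apostrophes) or single marks.
    pattern = r"[^\W_]+(?:'[^\W_]+)*|[" + re.escape(marks) + "]"
    distances, count = [], 0
    for token in re.findall(pattern, text):
        if token in mark_set:
            if count > 0:          # skip empty gaps such as "?!"
                distances.append(count)
            count = 0
        else:
            count += 1
    return distances

def dw_survival(k, p, beta):
    """P(X > k) under the assumed parametrization, for k >= 0."""
    return (1.0 - p) ** (k ** beta)

def dw_pmf(k, p, beta):
    """P(X = k) for k = 1, 2, ..."""
    return dw_survival(k - 1, p, beta) - dw_survival(k, p, beta)

def dw_hazard(k, p, beta):
    """P(X = k | X >= k): chance that a mark follows the k-th word."""
    return 1.0 - (1.0 - p) ** (k ** beta - (k - 1) ** beta)

if __name__ == "__main__":
    sample = "Hello, world. This is a test, right?"
    print(punctuation_distances(sample))
    for beta in (0.7, 1.0, 1.3):
        print(beta, [round(dw_hazard(k, 0.3, beta), 3) for k in range(1, 6)])
```

For the sample sentence, the distances are `[1, 1, 4, 1]`. Fitting p and beta to real distances (for example by maximum likelihood) is omitted here; the sketch only shows how the hazard function changes shape with beta.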

