CL · Oct 31, 2025

Identifying the Periodicity of Information in Natural Language

Yulin Ou, Yu Wang, Yang Xu, Hendrik Buschmeier

arXiv:2510.27241v1 · 4.92 citations · h-index: 18

Originality: Incremental advance

AI Analysis

This work addresses a theoretical question in linguistics and natural language processing, with potential applications in detecting LLM-generated text. It is incremental because it builds on existing periodicity detection methods.

The paper tackles the problem of identifying periodic patterns in the information encoded in natural language. It finds that a considerable proportion of human language exhibits strong periodicity, including new periods beyond typical structural units such as sentences.

Recent theoretical advances on information density in natural language raise the following question: to what degree does natural language exhibit periodic patterns in its encoded information? We address this question by introducing a new method called AutoPeriod of Surprisal (APS). APS adopts a canonical periodicity detection algorithm and can identify any significant periods in the surprisal sequence of a single document. Applying the algorithm to a set of corpora yields two main results. First, a considerable proportion of human language shows a strong pattern of periodicity in information. Second, new periods that fall outside the distributions of typical structural units in text (e.g., sentence boundaries, elementary discourse units) are found and further confirmed via harmonic regression modeling. We conclude that the periodicity of information in language is a joint outcome of structured factors and other driving factors that take effect at longer distances. The advantages of our periodicity detection method and its potential for detecting LLM-generated text are further discussed.
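
To make the idea of detecting periods in a surprisal sequence concrete, here is a minimal sketch of a generic two-stage detector: periodogram candidates, then a check against the autocorrelation function. It is not the authors' APS implementation. The function names, the permutation-based power threshold, the 99th-percentile cutoff, and the lag window are all assumptions chosen for illustration, and the input is taken to be a plain array of per-token surprisal values that some language model has already produced.

```python
import numpy as np

def autocorrelation(x):
    """Normalized autocorrelation of a 1-D sequence, computed via FFT."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    spectrum = np.fft.rfft(x, 2 * n)          # zero-pad to avoid circular wrap-around
    acf = np.fft.irfft(spectrum * np.conj(spectrum))[:n]
    return acf / acf[0]

def candidate_periods(x, n_perm=100, percentile=99, seed=0):
    """Periods whose periodogram power exceeds a permutation-based threshold."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    rng = np.random.default_rng(seed)
    power = np.abs(np.fft.rfft(x)) ** 2
    # Shuffling destroys temporal structure, so the maximum power of shuffled
    # copies estimates what pure chance can produce (assumed thresholding rule).
    max_power = [np.max(np.abs(np.fft.rfft(rng.permutation(x)))[1:] ** 2)
                 for _ in range(n_perm)]
    threshold = np.percentile(max_power, percentile)
    bins = np.arange(1, len(power))            # skip the zero-frequency bin
    return [n / k for k in bins[power[1:] > threshold]]

def validate_on_acf(acf, period):
    """Refine a candidate to a nearby ACF peak; return None if there is none."""
    n = len(acf)
    lag = int(round(period))
    width = max(1, lag // 4)                   # assumed search window around the lag
    lo, hi = max(1, lag - width), min(n - 2, lag + width)
    if lo >= hi:
        return None
    best = lo + int(np.argmax(acf[lo:hi + 1]))
    # Accept only an interior maximum with positive correlation (a "hill").
    if lo < best < hi and acf[best] > 0:
        return best
    return None

def detect_periods(surprisal):
    """Return validated periods (in tokens) for one document's surprisal sequence."""
    acf = autocorrelation(surprisal)
    validated = {validate_on_acf(acf, p) for p in candidate_periods(surprisal)}
    return sorted(p for p in validated if p is not None)

# Synthetic stand-in for a surprisal sequence: a 20-token cycle plus noise.
tokens = np.arange(600)
noise = np.random.default_rng(42).normal(scale=0.3, size=600)
fake_surprisal = 5.0 + np.sin(2 * np.pi * tokens / 20) + noise
print(detect_periods(fake_surprisal))
```

The periodogram alone tends to report spurious or coarsely resolved periods, which is why the sketch keeps only candidates that also sit on a peak of the autocorrelation function. In a real setting, `fake_surprisal` would be replaced by the negative log-probabilities of a document's tokens under a language model.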

