COITITMay 31

An extremal problem for completely unclustered Burrows-Wheeler images

arXiv:2606.0126721.4
Predicted impact top 56% in CO · last 90 daysOriginality Synthesis-oriented
AI Analysis

This work addresses a combinatorial extremal problem in stringology, providing theoretical bounds and constructions for completely unclustered BWT images, which is of interest to researchers in combinatorics on words and data compression.

The paper studies the extremal problem of minimizing the cyclic run number of a primitive necklace whose Burrows-Wheeler transform (BWT) is completely unclustered (i.e., has as many runs as positions). It proves a universal lower bound of ⌈n/2⌉ and determines exact values for small lengths, including the exceptional U_k(6)=4, and shows sharpness for infinitely many adjacent lengths under an Artin-type conjecture.

The Burrows--Wheeler transform is usually viewed as a clustering transform: it tends to group equal letters into long runs. We study the opposite extremal regime, where the BWT output is completely unclustered, that is, has as many equal-letter runs as positions. Known results imply, on the one hand, that the number of runs in the BWT of a Lyndon word can increase by at most a factor of two, and, on the other hand, that over every alphabet of size at least three completely unclustered BWT images exist in every length. This leads to the extremal problem lying between these two facts. For \(k\ge3\), let \(U_k(n)\) be the minimum cyclic run number of a primitive necklace of length \(n\) whose BWT has \(n\) runs. We prove the universal lower bound \(U_k(n)\ge\lceil n/2\rceil\), reduce the sharpness problem for one-cycle BWT images \(L\) to the Hamming identity \[ \cruns(\BWT^{-1}(L))=\dH(L,\sort(L)), \] and develop a natural multiset-of-necklaces relaxation with an explicit constant-cycle correction. We compute the small values, including the exceptional value \(U_k(6)=4\), prove a parity obstruction for the Parikh vectors of sharp examples, and determine the multiset relaxation exactly. Finally, for every prime \(p\equiv5\pmod8\) for which \(2\) is a primitive root modulo \(p\), we prove sharpness in the adjacent lengths \(p-1\) and \(p\). Under the corresponding Artin-type infinitude hypothesis, this gives infinitely many adjacent sharp pairs.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes