Ranked MSO-enumeration over compressed words
This is the first result for ranked query enumeration on compressed data: it shows how to enumerate the answers of an MSO-query over a grammar-compressed string efficiently, in an MSO-definable order.
The paper shows that ranked MSO-query enumeration on strings given by straight-line programs can be solved with linear preprocessing and constant delay, generalizing previous results from uncompressed to compressed data. A corollary is that for a fixed polyregular function, the output symbols can be listed with linear preprocessing and constant delay.
It is shown that the ranked query enumeration problem for a fixed MSO-query on strings can be solved with linear preprocessing and constant delay in the grammar-compressed setting, where the input string is given by a so-called straight-line program, i.e., a context-free grammar that produces exactly one string. Here, "ranked" means that the output tuples of the MSO-query are printed in a specific order that has to be MSO-definable. This is the first result for ranked query enumeration on compressed data. A corollary of this result is that for a fixed polyregular function $f$ and a word $w$ that is given by a straight-line program of size $n$, one can list, after preprocessing time $\mathcal{O}(n)$, the symbols in $f(w)$ from left to right with constant delay, which generalizes a result of Bojańczyk for the case where $w$ is uncompressed. The proofs of these results are based on factorization trees, which are made accessible to the grammar-compressed setting (a contribution of independent interest).
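To make the notion of a straight-line program concrete, the following is a minimal Python sketch. It is an illustration only and not the paper's enumeration algorithm: the rule format (a dictionary listed in bottom-up order, with each rule either a single terminal or a pair of nonterminals) and the names `RULES`, `slp_lengths`, `symbol_at`, and `expand` are assumptions made for this example. It shows that the lengths of all derived strings can be computed in one linear pass over the grammar, and that a single symbol of the derived string can be read without decompressing it.

```python
from typing import Dict, Tuple, Union

# Assumed rule format: a nonterminal maps either to one terminal symbol (str)
# or to a pair of nonterminals (tuple). Rules are listed bottom-up, so every
# nonterminal is defined before it is used.
Rule = Union[str, Tuple[str, str]]

RULES: Dict[str, Rule] = {
    "X1": "b",
    "X2": "a",
    "X3": ("X2", "X1"),  # ab
    "X4": ("X3", "X2"),  # aba
    "X5": ("X4", "X3"),  # abaab
    "X6": ("X5", "X4"),  # abaababa
}


def slp_lengths(rules: Dict[str, Rule]) -> Dict[str, int]:
    """Length of the string derived from each nonterminal, in one pass."""
    lengths: Dict[str, int] = {}
    for name, rule in rules.items():
        if isinstance(rule, str):
            lengths[name] = 1
        else:
            left, right = rule
            lengths[name] = lengths[left] + lengths[right]
    return lengths


def symbol_at(rules: Dict[str, Rule], lengths: Dict[str, int],
              start: str, i: int) -> str:
    """Symbol at 0-based position i of the derived string, without expanding it."""
    if not 0 <= i < lengths[start]:
        raise IndexError(i)
    name = start
    while True:
        rule = rules[name]
        if isinstance(rule, str):
            return rule
        left, right = rule
        if i < lengths[left]:
            name = left           # position lies in the left part
        else:
            i -= lengths[left]    # skip the left part, continue on the right
            name = right


def expand(rules: Dict[str, Rule], start: str) -> str:
    """Full decompression; used here only to check the other functions."""
    words: Dict[str, str] = {}
    for name, rule in rules.items():
        if isinstance(rule, str):
            words[name] = rule
        else:
            words[name] = words[rule[0]] + words[rule[1]]
    return words[start]


if __name__ == "__main__":
    lengths = slp_lengths(RULES)
    print(expand(RULES, "X6"))
    print("".join(symbol_at(RULES, lengths, "X6", i) for i in range(lengths["X6"])))
```

Because each rule can concatenate two earlier nonterminals, a straight-line program with $n$ rules can derive a string of length exponential in $n$, which is why bounds stated in terms of the size of the program rather than the length of the string are significant.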