AIMay 8

Exact Regular-Constrained Variable-Order Markov Generation via Sparse Context-State Belief Propagation

arXiv:2605.078393.6
Predicted impact top 88% in AI · last 90 daysOriginality Incremental advance
AI Analysis

For researchers and practitioners using variable-order Markov models with regular constraints, this provides an exact method that avoids expanding to all K-tuples, addressing a known bottleneck in constrained generation.

The paper extends exact belief propagation for regular-constrained sequence generation from first-order to variable-order Markov models, achieving linear inference in the sequence horizon for fixed models and polynomial complexity in reachable product edges. It also supports reversible data augmentation without storing transformed corpora.

Variable-order Markov models generate sequences over a finite alphabet by conditioning each symbol on the longest available suffix of the generated history. Regular constraints, by contrast, describe finite-horizon control requirements by an automaton: fixed positions, forced endings, metrical patterns, and forbidden copied fragments are all special cases. Existing exact methods already handle regular constraints with belief propagation for first-order Markov chains. The contribution here is the variable-order extension: identifying the state space on which the existing BP-regular machinery must be run when the generator is a variable-order/backoff model. A first-order constraint layer can enforce useful support conditions, but it computes future mass after merging histories that a variable-order generator deliberately keeps distinct. We formalize this mismatch and give the sparse construction obtained by replacing the first-order Markov state with the observed context state, then taking the standard product with the regular constraint automaton. For a fixed trained context graph and automaton, inference is linear in the sequence horizon; in general it is polynomial in the number of reachable product edges. This gives the correct variable-order distribution conditioned on regular constraints without expanding to all K-tuples. The same finite-source interface supports reversible data augmentation by inverse count lookup, matching materialized transposition augmentation without storing transformed corpora. We also separate exact BP inference from generation-time backoff policies, such as singleton avoidance, whose stochastic semantics must be made explicit if exactness is claimed.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes