LO May 22
A finer reparameterisation theorem for MSO and FO queries on strings
Lê Thành Dũng Nguyên, Paweł Parys
We show a theorem on monadic second-order k-ary queries on finite words. It may be illustrated by the following example: if the number of results of a query on binary strings is O(number of 0s $\times$ number of 1s), then each result can be MSO-definably identified from a 0-position, a 1-position and some finite data. Our proofs also handle the case of first-order logic / aperiodic monoids. Thus we can state and prove the folklore theorem that dimension minimisation holds for first-order string-to-string interpretations.
LO May 13
Subsumption in $\mathcal{FL}_{\bot \mathit{reg}}$ with TBoxes Is in ExpTime
Michał Henne, Barbara Morawska, Paweł Parys
Description logics (DL) are a family of formal languages for representing and reasoning about structured knowledge in terms of concepts and their relationships. A central reasoning problem in DL is concept subsumption. Although this problem has been widely studied, important open problems remain for certain logics. The expressive power of DLs depends on the constructors available for building complex concepts. In this work, we investigate subsumption in the restricted logic $\mathcal{FL}_{\bot \mathit{reg}}$ and its related fragments $\mathcal{FL}_\mathit{reg}$, $\mathcal{FL}_\bot$, and $\mathcal{FL}_0$. These logics support value restrictions over role names, where the subscript $\bot$ denotes the presence of the empty concept and $\mathit{reg}$ denotes the use of regular expressions over roles. None of these logics includes concept negation. We show that deciding subsumption between two concept descriptions in $\mathcal{FL}_{\bot \mathit{reg}}$ and $\mathcal{FL}_\mathit{reg}$ is PSpace-complete. When subsumption is considered with respect to a TBox (i.e., a set of axioms), the complexity increases to ExpTime-complete. Our results are obtained via a novel reduction to parity pushdown games.
AI Oct 2, 2025
Constrained Adaptive Rejection Sampling
Paweł Parys, Sairam Vaidya, Taylor Berg-Kirkpatrick et al.
Language Models (LMs) are increasingly used in applications where generated outputs must satisfy strict semantic or syntactic constraints. Existing approaches to constrained generation fall along a spectrum: greedy constrained decoding (GCD) methods enforce validity during decoding but distort the LM's distribution, while rejection sampling (RS) preserves fidelity but wastes computation by discarding invalid outputs. Both extremes are problematic in domains such as program fuzzing, where both validity and diversity of samples are essential. We present Constrained Adaptive Rejection Sampling (CARS), an approach that strictly improves the sample-efficiency of RS without distributional distortion. CARS begins with unconstrained LM sampling and adaptively rules out constraint-violating continuations by recording them in a trie and subtracting their probability mass from future draws. This adaptive pruning ensures that prefixes proven invalid are never revisited, acceptance rates improve monotonically, and the resulting samples exactly follow the constrained distribution. In experiments across several domains, including program fuzzing and molecular generation, CARS consistently achieves higher efficiency, measured as the number of LM forward passes per valid sample, while also producing greater sample diversity than both GCD and methods that approximate the LM's distribution.
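The trie-and-subtract idea described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes fixed-length sequences, a toy two-token "LM" whose probabilities are exact binary fractions, and a validity check that runs only on complete sequences. All names (`TrieNode`, `cars_sample`, `toy_next_probs`, `has_one_b`) are hypothetical.

```python
import random


class TrieNode:
    """One explored prefix (hypothetical helper, not from the paper)."""

    def __init__(self):
        self.children = {}   # token -> TrieNode
        self.removed = 0.0   # conditional mass below this prefix proven invalid


def live_fraction(node, token):
    """Share of the mass under `token` that has not been ruled out yet."""
    child = node.children.get(token)
    return 1.0 if child is None else 1.0 - child.removed


def cars_sample(next_probs, is_valid, length, root, rng):
    """Draw one valid sequence; returns (sequence, number_of_rejections)."""
    rejections = 0
    while root.removed < 1.0:
        seq, path, node = [], [root], root
        for _ in range(length):
            probs = next_probs(tuple(seq))
            tokens = list(probs)
            # Reweight each token by the mass still alive beneath it.
            weights = [probs[t] * live_fraction(node, t) for t in tokens]
            tok = rng.choices(tokens, weights=weights)[0]
            seq.append(tok)
            node = node.children.setdefault(tok, TrieNode())
            path.append(node)
        if is_valid(seq):
            return seq, rejections
        # Record the failure and push the removed mass up to the root.
        rejections += 1
        node.removed = 1.0
        for depth in range(length - 1, -1, -1):
            parent = path[depth]
            probs = next_probs(tuple(seq[:depth]))
            parent.removed = sum(
                probs[t] * child.removed for t, child in parent.children.items()
            )
    raise ValueError("every sequence has been proven invalid")


def toy_next_probs(prefix):
    # Assumed toy model: tokens are independent of the prefix.
    return {"a": 0.75, "b": 0.25}


def has_one_b(seq):
    # Assumed toy constraint: exactly one "b" in the sequence.
    return seq.count("b") == 1


rng = random.Random(0)
root = TrieNode()          # shared across draws, so pruning accumulates
samples, total_rejections = [], 0
for _ in range(200):
    seq, rejected = cars_sample(toy_next_probs, has_one_b, 3, root, rng)
    samples.append("".join(seq))
    total_rejections += rejected
```

In this toy setting there are only five invalid length-3 strings, so the shared trie can reject at most five times over all draws, while plain rejection sampling would keep rejecting at a constant rate. Each accepted sample is drawn with probability proportional to its original model probability, because the reweighting factors telescope along the path from the root.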