AIMay 24
Solving Combinatorial Counting Problems with Weighted First-Order Model CountingYuanhong Wang, Juhua Pu, Yuxu Zhou et al.
Combinatorial counting problems pervade artificial intelligence, statistics, and discrete mathematics. Whether the task is enumerating subsets, multisets, permutations, partitions, or compositions under structural and arithmetic constraints, solving it remains a stubbornly manual exercise. Closed-form derivations are powerful but brittle, while naive encodings to propositional model counting or constraint satisfaction destroy the exchangeability that makes counting tractable in the first place. We present Cofola (COmbinatorial counting LAnguage with First-Order logic), a typed declarative language whose primitives are the combinatorial objects that recur in everyday counting questions, including sets, bags, tuples, sequences, circles, partitions, and compositions, together with natural relational and arithmetic constraints over them. A denotational semantics maps every Cofola program to a well-defined combinatorial counting problem, and a three-phase compilation pipeline (preprocessing, decomposition, and symmetry-preserving encoding) reduces this problem to a weighted first-order model counting (WFOMC) instance augmented with coefficient-extraction constraints. To stay inside known domain-liftable fragments whenever possible, the encoding groups indistinguishable entities, breaks the symmetry of unordered groupings lexicographically, and encodes sequences and circles via order axioms. On a suite of representative combinatorial counting problems, ranging from textbook math problems to multi-object scenarios that the closest prior framework cannot express, Cofola produces concise specifications and a uniform solving pipeline that is practical end-to-end.
AIAug 17, 2023
Lifted Algorithms for Symmetric Weighted First-Order Model SamplingYuanhong Wang, Juhua Pu, Yuyi Wang et al.
Weighted model counting (WMC) is the task of computing the weighted sum of all satisfying assignments (i.e., models) of a propositional formula. Similarly, weighted model sampling (WMS) aims to randomly generate models with probability proportional to their respective weights. Both WMC and WMS are hard to solve exactly, falling under the $\#\mathsf{P}$-hard complexity class. However, it is known that the counting problem may sometimes be tractable, if the propositional formula can be compactly represented and expressed in first-order logic. In such cases, model counting problems can be solved in time polynomial in the domain size, and are known as domain-liftable. The following question then arises: Is it also the case for weighted model sampling? This paper addresses this question and answers it affirmatively. Specifically, we prove the domain-liftability under sampling for the two-variables fragment of first-order logic with counting quantifiers in this paper, by devising an efficient sampling algorithm for this fragment that runs in time polynomial in the domain size. We then further show that this result continues to hold even in the presence of cardinality constraints. To empirically verify our approach, we conduct experiments over various first-order formulas designed for the uniform generation of combinatorial structures and sampling in statistical-relational models. The results demonstrate that our algorithm outperforms a start-of-the-art WMS sampler by a substantial margin, confirming the theoretical results.
AIFeb 6, 2023
On Exact Sampling in the Two-Variable Fragment of First-Order LogicYuanhong Wang, Juhua Pu, Yuyi Wang et al.
In this paper, we study the sampling problem for first-order logic proposed recently by Wang et al. -- how to efficiently sample a model of a given first-order sentence on a finite domain? We extend their result for the universally-quantified subfragment of two-variable logic $\mathbf{FO}^2$ ($\mathbf{UFO}^2$) to the entire fragment of $\mathbf{FO}^2$. Specifically, we prove the domain-liftability under sampling of $\mathbf{FO}^2$, meaning that there exists a sampling algorithm for $\mathbf{FO}^2$ that runs in time polynomial in the domain size. We then further show that this result continues to hold even in the presence of counting constraints, such as $\forall x\exists_{=k} y: \varphi(x,y)$ and $\exists_{=k} x\forall y: \varphi(x,y)$, for some quantifier-free formula $\varphi(x,y)$. Our proposed method is constructive, and the resulting sampling algorithms have potential applications in various areas, including the uniform generation of combinatorial structures and sampling in statistical-relational models such as Markov logic networks and probabilistic logic programs.
LOJul 16, 2024
Bridging Weighted First Order Model Counting and Graph PolynomialsQipeng Kuang, Ondřej Kuželka, Yuanhong Wang et al.
The Weighted First-Order Model Counting Problem (WFOMC) asks to compute the weighted sum of models of a given first-order logic sentence over a given domain. It can be solved in time polynomial in the domain size for sentences from the two-variable fragment with counting quantifiers, known as $C^2$. This polynomial-time complexity is known to be retained when extending $C^2$ by one of the following axioms: linear order axiom, tree axiom, forest axiom, directed acyclic graph axiom or connectedness axiom. An interesting question remains as to which other axioms can be added to the first-order sentences in this way. We provide a new perspective on this problem by associating WFOMC with graph polynomials. Using WFOMC, we define Weak Connectedness Polynomial and Strong Connectedness Polynomials for first-order logic sentences. It turns out that these polynomials have the following interesting properties. First, they can be computed in polynomial time in the domain size for sentences from $C^2$. Second, we can use them to solve WFOMC with all of the existing axioms known to be tractable as well as with new ones such as bipartiteness, strong connectedness, having $k$ connected components, etc. Third, the well-known Tutte polynomial can be recovered as a special case of the Weak Connectedness Polynomial, and the Strict and Non-Strict Directed Chromatic Polynomials can be recovered from the Strong Connectedness Polynomials.
LONov 12, 2025
Tractable Weighted First-Order Model Counting with Bounded Treewidth Binary EvidenceVáclav Kůla, Qipeng Kuang, Yuyi Wang et al.
The Weighted First-Order Model Counting Problem (WFOMC) asks to compute the weighted sum of models of a given first-order logic sentence over a given domain. Conditioning WFOMC on evidence -- fixing the truth values of a set of ground literals -- has been shown impossible in time polynomial in the domain size (unless $\mathsf{\#P \subseteq FP}$) even for fragments of logic that are otherwise tractable for WFOMC without evidence. In this work, we address the barrier by restricting the binary evidence to the case where the underlying Gaifman graph has bounded treewidth. We present a polynomial-time algorithm in the domain size for computing WFOMC for the two-variable fragments $\text{FO}^2$ and $\text{C}^2$ conditioned on such binary evidence. Furthermore, we show the applicability of our algorithm in combinatorial problems by solving the stable seating arrangement problem on bounded-treewidth graphs of bounded degree, which was an open problem. We also conducted experiments to show the scalability of our algorithm compared to the existing model counting solvers.
LOMay 12
On Knowledge Compilation For Two-Variable First-Order LogicQiaolan Meng, Juhua Pu, Hongting Niu et al.
Knowledge compilation transforms logical theories into circuit representations that support efficient reasoning. We study this problem for propositional groundings of FO2, the two-variable fragment of first-order logic over finite domains. Given an FO2 sentence and a domain of size n, its grounding yields a propositional theory over ground atoms. We ask whether such theories admit compact representations in DNNF-based and related knowledge compilation languages, and whether these can be constructed efficiently, both with respect to the domain size n for a fixed sentence. We show first that compact compilation is impossible in general: there exists an FO2 sentence whose grounding over a domain of size n requires DNNF size $2^{Ω(n)}$. On the positive side, we develop a two-stage compiler that exploits the symmetries inherent in the propositional groundings of FO2 sentences. It branches on unary and binary types rather than individual ground atoms, in a similar spirit to lifted inferences for probabilistic relational models. Moreover, it optimizes the compilation process by efficiently identifying and caching residual subproblems that are equivalent with respect to future extensions. Experiments show the practical efficiency of our approach, which often produces smaller circuits and compiles faster than straightforward grounding-based baselines.
LOMay 5
A Fast Model Counting Algorithm for Two-Variable Logic with Counting and Modulo Counting QuantifiersShixin Sun, Astrid Klipfel, Ondřej Kuželka et al.
Weighted first-order model counting (WFOMC) is a central task in lifted probabilistic inference: It asks for the weighted sum of all models of a first-order sentence over a finite domain. A long line of work has identified domain-liftable fragments of first-order logic, that is, syntactic classes for which WFOMC can be solved in time polynomial in the domain size. Among them, the two-variable fragment with counting quantifiers, $\mathbf{C}^2$, is one of the most expressive known liftable fragments. Existing algorithms for $\mathbf{C}^2$, however, establish tractability through multi-stage reductions that eliminate counting quantifiers via cardinality constraints, which introduces substantial practical overhead as the domain size grows. In this paper, we introduce IncrementalWFOMC3, a lifted algorithm for WFOMC on $\mathbf{C}^2$ and its modulo counting extension, $\mathbf{C}^2_{\text{mod}}$. Instead of relying on reduction techniques, IncrementalWFOMC3 operates directly on a Scott normal form that retains counting quantifiers throughout inference. This direct treatment yields two main results. First, we derive a tighter data-complexity bound for WFOMC in $\mathbf{C}^2$, reducing the degree of the polynomial from quadratic to linear in the counting parameters. Second, we prove that $\mathbf{C}^2_{\text{mod}}$ is domain-liftable, extending tractability from $\mathbf{C}^2$ to a richer fragment with native modulo counting support. Finally, our empirical evaluation shows that IncrementalWFOMC3 delivers orders-of-magnitude runtime improvements and better scalability than both existing WFOMC algorithms and state-of-the-art propositional model counters.
LOAug 15, 2025
Weighted First Order Model Counting for Two-variable Logic with Axioms on Two RelationsQipeng Kuang, Václav Kůla, Ondřej Kuželka et al.
The Weighted First-Order Model Counting Problem (WFOMC) asks to compute the weighted sum of models of a given first-order logic sentence over a given domain. The boundary between fragments for which WFOMC can be computed in polynomial time relative to the domain size lies between the two-variable fragment ($\text{FO}^2$) and the three-variable fragment ($\text{FO}^3$). It is known that WFOMC for \FOthree{} is $\mathsf{\#P_1}$-hard while polynomial-time algorithms exist for computing WFOMC for $\text{FO}^2$ and $\text{C}^2$, possibly extended by certain axioms such as the linear order axiom, the acyclicity axiom, and the connectedness axiom. All existing research has concentrated on extending the fragment with axioms on a single distinguished relation, leaving a gap in understanding the complexity boundary of axioms on multiple relations. In this study, we explore the extension of the two-variable fragment by axioms on two relations, presenting both negative and positive results. We show that WFOMC for $\text{FO}^2$ with two linear order relations and $\text{FO}^2$ with two acyclic relations are $\mathsf{\#P_1}$-hard. Conversely, we provide an algorithm in time polynomial in the domain size for WFOMC of $\text{C}^2$ with a linear order relation, its successor relation and another successor relation.
AIJul 25, 2025
Faster Lifting for Ordered Domains with Predecessor RelationsKuncheng Zou, Jiahao Mai, Yonggang Zhang et al.
We investigate lifted inference on ordered domains with predecessor relations, where the elements of the domain respect a total (cyclic) order, and every element has a distinct (clockwise) predecessor. Previous work has explored this problem through weighted first-order model counting (WFOMC), which computes the weighted sum of models for a given first-order logic sentence over a finite domain. In WFOMC, the order constraint is typically encoded by the linear order axiom introducing a binary predicate in the sentence to impose a linear ordering on the domain elements. The immediate and second predecessor relations are then encoded by the linear order predicate. Although WFOMC with the linear order axiom is theoretically tractable, existing algorithms struggle with practical applications, particularly when the predecessor relations are involved. In this paper, we treat predecessor relations as a native part of the axiom and devise a novel algorithm that inherently supports these relations. The proposed algorithm not only provides an exponential speedup for the immediate and second predecessor relations, which are known to be tractable, but also handles the general k-th predecessor relations. The extensive experiments on lifted inference tasks and combinatorics math problems demonstrate the efficiency of our algorithm, achieving speedups of a full order of magnitude.
LOMay 26, 2025
Model Enumeration of Two-Variable Logic with Quadratic Delay ComplexityQiaolan Meng, Juhua Pu, Hongting Niu et al.
We study the model enumeration problem of the function-free, finite domain fragment of first-order logic with two variables ($FO^2$). Specifically, given an $FO^2$ sentence $Γ$ and a positive integer $n$, how can one enumerate all the models of $Γ$ over a domain of size $n$? In this paper, we devise a novel algorithm to address this problem. The delay complexity, the time required between producing two consecutive models, of our algorithm is quadratic in the given domain size $n$ (up to logarithmic factors) when the sentence is fixed. This complexity is almost optimal since the interpretation of binary predicates in any model requires at least $Ω(n^2)$ bits to represent.
NIFeb 9, 2024
ForestColl: Throughput-Optimal Collective Communications on Heterogeneous Network FabricsLiangyu Zhao, Saeed Maleki, Yuanhong Wang et al.
As modern DNN models grow ever larger, collective communications between the accelerators (allreduce, etc.) emerge as a significant performance bottleneck. Designing efficient communication schedules is challenging, given today's heterogeneous and diverse network fabrics. We present ForestColl, a tool that generates throughput-optimal schedules for any network topology. ForestColl constructs broadcast/aggregation spanning trees as the communication schedule, achieving theoretical optimality. Its schedule generation runs in polynomial time and is highly scalable. ForestColl supports any network fabric, including both switching fabrics and direct accelerator connections. We evaluated ForestColl on AMD MI250 and NVIDIA DGX A100 & H100 clusters. ForestColl showed significant improvements over the vendors' own optimized communication libraries across various settings and in LLM training. ForestColl also outperformed other state-of-the-art schedule generation techniques with both more efficient generated schedules and substantially faster generation speed.
LGNov 12, 2017
On the ERM Principle with Networked DataYuanhong Wang, Yuyi Wang, Xingwu Liu et al.
Networked data, in which every training example involves two objects and may share some common objects with others, is used in many machine learning tasks such as learning to rank and link prediction. A challenge of learning from networked examples is that target values are not known for some pairs of objects. In this case, neither the classical i.i.d.\ assumption nor techniques based on complete U-statistics can be used. Most existing theoretical results of this problem only deal with the classical empirical risk minimization (ERM) principle that always weights every example equally, but this strategy leads to unsatisfactory bounds. We consider general weighted ERM and show new universal risk bounds for this problem. These new bounds naturally define an optimization problem which leads to appropriate weights for networked examples. Though this optimization problem is not convex in general, we devise a new fully polynomial-time approximation scheme (FPTAS) to solve it.