Marijn J. H. Heule

h-index: 32

Papers: 10

Citations: 111

Novelty: 46%

AI Score: 48

Ranked #29,243 of 194,257 authors (top 15%); #1,592 in AI (top 13%)

10 Papers

3.9 | AI | Mar 27, 2023

A Linear Weight Transfer Rule for Local Search

Md Solimul Chowdhury, Cayden R. Codel, Marijn J. H. Heule

The Divide and Distribute Fixed Weights algorithm (ddfw) is a dynamic local search SAT-solving algorithm that transfers weight from satisfied to falsified clauses in local minima. ddfw is remarkably effective on several hard combinatorial instances. Yet, despite its success, it has received little study since its debut in 2005. In this paper, we propose three modifications to the base algorithm: a linear weight transfer method that moves a dynamic amount of weight between clauses in local minima, an adjustment to how satisfied clauses are chosen in local minima to give weight, and a weighted-random method of selecting variables to flip. We implemented our modifications to ddfw on top of the solver yalsat. Our experiments show that our modifications boost the performance compared to the original ddfw algorithm on multiple benchmarks, including those from the past three years of SAT competitions. Moreover, our improved solver exclusively solves hard combinatorial instances that refute a conjecture on the lower bound of two Van der Waerden numbers set forth by Ahmed et al. (2014), and it performs well on a hard graph-coloring instance that has been open for over three decades.
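The weight-transfer idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the linear form `a * weight + c`, the constants, the initial weight, and the neighbor-selection rule (the heaviest satisfied clause that shares a literal) are all assumptions made for illustration.

```python
INIT_WEIGHT = 8.0  # assumed initial clause weight (illustrative)

def transfer_amount(weight, a=0.1, c=1.0):
    """Assumed linear rule: the amount moved grows with the donor's weight."""
    return a * weight + c

def redistribute(clauses, weights, assignment):
    """One weight-transfer pass at a local minimum.

    clauses:    list of tuples of nonzero ints (DIMACS-style literals)
    weights:    list of floats, one per clause (modified in place)
    assignment: dict mapping variable -> bool
    """
    def satisfied(clause):
        return any(assignment[abs(l)] == (l > 0) for l in clause)

    sat = [i for i, cl in enumerate(clauses) if satisfied(cl)]
    for i, cl in enumerate(clauses):
        if satisfied(cl):
            continue
        # Hypothetical neighbor rule: satisfied clauses sharing a literal.
        neighbors = [j for j in sat if set(clauses[j]) & set(cl)]
        if not neighbors:
            continue
        donor = max(neighbors, key=lambda j: weights[j])
        amount = transfer_amount(weights[donor])
        weights[donor] -= amount
        weights[i] += amount
    return weights

clauses = [(1, 2), (1,)]
weights = [INIT_WEIGHT, INIT_WEIGHT]
redistribute(clauses, weights, {1: False, 2: True})
```

In this toy instance the satisfied clause `(1, 2)` donates weight to the falsified clause `(1,)`, and the total weight is conserved.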

2.7 | CO | Apr 23

Doubly Saturated Ramsey Graphs: A Case Study in Computer-Assisted Mathematical Discovery

Benjamin Przybocki, John Mackey, Marijn J. H. Heule et al.

Ramsey-good graphs are graphs that contain neither a clique of size $s$ nor an independent set of size $t$. We study doubly saturated Ramsey-good graphs, defined as Ramsey-good graphs in which the addition or removal of any edge necessarily creates an $s$-clique or a $t$-independent set. We present a method combining SAT solving with bespoke LLM-generated code to discover infinite families of such graphs, answering a question of Grinstead and Roberts from 1982. In addition, we use LLMs to generate and formalize correctness proofs in Lean. This case study highlights the potential of integrating automated reasoning, large language models, and formal verification to accelerate mathematical discovery. We argue that such tool-driven workflows will play an increasingly central role in experimental mathematics.
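The definition can be checked by brute force on small graphs. The sketch below is only an illustration of the definition, not the SAT-based method of the paper; the function names are hypothetical.

```python
from itertools import combinations

def has_clique(n, edges, k):
    """True if some k vertices are pairwise adjacent."""
    return any(all(frozenset(p) in edges for p in combinations(vs, 2))
               for vs in combinations(range(n), k))

def has_independent_set(n, edges, k):
    """True if some k vertices are pairwise non-adjacent."""
    return any(all(frozenset(p) not in edges for p in combinations(vs, 2))
               for vs in combinations(range(n), k))

def is_ramsey_good(n, edges, s, t):
    return not has_clique(n, edges, s) and not has_independent_set(n, edges, t)

def is_doubly_saturated(n, edges, s, t):
    """Ramsey-good, and flipping any single vertex pair destroys that."""
    if not is_ramsey_good(n, edges, s, t):
        return False
    for p in combinations(range(n), 2):
        flipped = edges ^ {frozenset(p)}  # add the edge if absent, else remove it
        if is_ramsey_good(n, flipped, s, t):
            return False
    return True

# The 5-cycle with s = t = 3: no triangle and no independent set of size 3.
c5 = {frozenset((i, (i + 1) % 5)) for i in range(5)}
print(is_doubly_saturated(5, c5, 3, 3))
```

For the 5-cycle, adding any chord closes a triangle and removing any edge leaves an independent set of size 3, so the check succeeds. The 4-cycle is Ramsey-good for the same parameters but not doubly saturated.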

8.2 | LO | May 20

Tao's Equational Proof Challenge Accepted (Technical Report)

Lydia Kondylidou, Jasmin Blanchette, Marijn J. H. Heule

In the context of the Equational Theories Project, Terence Tao posed the challenge of finding alternatives to a complicated 62-step proof found by the Vampire superposition prover. We introduce a proof minimization tool called Krympa. Using a combination of brute force and heuristics, and exploiting both Vampire and the Twee equational prover, the tool reduces the 62-step proof to 20 steps, each corresponding to a rewrite. In an empirical evaluation, it also performs well on 1431 equational problems originating from the same project, reducing in particular a 151-step proof to only 10 steps.

7.6 | CC | Mar 29

Automated Reencoding Meets Graph Theory

Benjamin Przybocki, Bernardo Subercaseaux, Marijn J. H. Heule

Bounded Variable Addition (BVA) is a central preprocessing method in modern state-of-the-art SAT solvers. We provide a graph-theoretic characterization of which 2-CNF encodings can be constructed by an idealized BVA algorithm. Based on this insight, we prove new results about the behavior and limitations of BVA and its interaction with other preprocessing techniques. We show that idealized BVA, plus some minor additional preprocessing (e.g., equivalent literal substitution), can reencode any 2-CNF formula with $n$ variables into an equivalent 2-CNF formula with $(\tfrac{\lg(3)}{4}+o(1))\,\tfrac{n^2}{\lg n}$ clauses. Furthermore, we show that without the additional preprocessing the constant factor worsens from $\tfrac{\lg(3)}{4} \approx 0.396$ to $1$, and that no reencoding method can achieve a constant below $0.25$. On the other hand, for the at-most-one constraint on $n$ variables, we prove that idealized BVA cannot reencode this constraint using fewer than $3n-6$ clauses, a bound that we prove is achieved by actual implementations. In particular, this shows that the product encoding for at-most-one, which uses $2n+o(n)$ clauses, cannot be constructed by BVA regardless of the heuristics used. Finally, our graph-theoretic characterization of BVA allows us to leverage recent work in algorithmic graph theory to develop a drastically more efficient implementation of BVA that achieves a comparable clause reduction on random monotone 2-CNF formulas.
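To make the $3n-6$ clause count concrete, the sketch below builds an at-most-one encoding with that many clauses by a recursive split with auxiliary variables, and compares it against the pairwise encoding. This is a well-known construction used here for illustration; that it matches the output of any particular BVA implementation is an assumption, and the function names are hypothetical.

```python
from itertools import combinations, product

def amo_pairwise(lits):
    """Pairwise at-most-one: n(n-1)/2 binary clauses, no auxiliary variables."""
    return [(-a, -b) for a, b in combinations(lits, 2)]

def amo_split(lits, next_var):
    """Recursive split: AMO(l1, l2, l3, y) and AMO(-y, l4, ..., ln).

    Returns (clauses, next unused variable). Uses 3n - 6 clauses for n >= 3.
    """
    if len(lits) <= 4:
        return amo_pairwise(lits), next_var
    y = next_var
    head = amo_pairwise(lits[:3] + [y])  # 6 clauses
    tail, next_var = amo_split([-y] + lits[3:], next_var + 1)
    return head + tail, next_var

def satisfiable_with(clauses, fixed, aux_vars):
    """Brute force: can the auxiliary variables be set to satisfy all clauses?"""
    for bits in product([False, True], repeat=len(aux_vars)):
        model = {**fixed, **dict(zip(aux_vars, bits))}
        if all(any(model[abs(l)] == (l > 0) for l in cl) for cl in clauses):
            return True
    return False

def encodes_amo(n):
    """Check the split encoding against the at-most-one semantics on n inputs."""
    xs = list(range(1, n + 1))
    clauses, next_var = amo_split(xs, n + 1)
    aux = list(range(n + 1, next_var))
    return all(
        satisfiable_with(clauses, dict(zip(xs, bits)), aux) == (sum(bits) <= 1)
        for bits in product([False, True], repeat=n)
    )

for n in range(3, 9):
    split, _ = amo_split(list(range(1, n + 1)), n + 1)
    print(n, len(amo_pairwise(range(1, n + 1))), len(split), encodes_amo(n))
```

Each split step spends 6 clauses and reduces the number of literals by 2, which gives the linear count, whereas the pairwise encoding grows quadratically.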

2.3 | CG | Jun 1, 2025 | Code

Unfolding Boxes with Local Constraints

Long Qian, Eric Wang, Bernardo Subercaseaux et al.

We consider the problem of finding and enumerating polyominoes that can be folded into multiple non-isomorphic boxes. While several computational approaches have been proposed, including SAT, randomized algorithms, and decision diagrams, none has been able to perform at scale. We argue that existing SAT encodings are hindered by the presence of global constraints (e.g., graph connectivity or acyclicity), which are generally hard to encode effectively and hard for solvers to reason about. In this work, we propose a new SAT-based approach that replaces these global constraints with simple local constraints that have substantially better propagation properties. Our approach dramatically improves the scalability of both computing and enumerating common box unfoldings: (i) while previous approaches could only find common unfoldings of two boxes up to area 88, ours easily scales beyond 150, and (ii) while previous approaches were only able to enumerate common unfoldings up to area 30, ours scales up to 60. This allows us to rule out 46, 54, and 58 as the smallest areas allowing a common unfolding of three boxes, thereby refuting a conjecture of Xu et al. (2017).
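A common unfolding requires all boxes to have the same surface area, so the candidate areas follow from simple arithmetic. The sketch below (a hypothetical helper, not part of the paper's SAT encoding) lists the integer boxes $a \le b \le c$ with surface area $2(ab + bc + ca)$ equal to a given value.

```python
def boxes_with_area(area):
    """All integer boxes (a, b, c), a <= b <= c, with surface area `area`."""
    if area % 2:
        return []
    half = area // 2  # need ab + bc + ca == half
    boxes = []
    a = 1
    while 3 * a * a <= half:              # smallest case is b = c = a
        b = a
        while a * b + b * (a + b) <= half:  # guarantees c >= b
            rest = half - a * b             # rest == c * (a + b)
            if rest % (a + b) == 0:
                boxes.append((a, b, rest // (a + b)))
            b += 1
        a += 1
    return boxes

for area in (46, 54, 58):
    print(area, boxes_with_area(area))
```

Each of the areas 46, 54, and 58 admits three distinct boxes, which is why they are natural candidates for a common unfolding of three boxes.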

7.2 | CR | Sep 21, 2020 | Code

Modeling Techniques for Logic Locking

Joseph Sweeney, Marijn J. H. Heule, Lawrence Pileggi

Logic locking is a method to prevent intellectual property (IP) piracy. However, under a reasonable attack model, SAT-based methods have proven to be powerful in obtaining the secret key. In response, many locking techniques have been developed to specifically resist this form of attack. In this paper, we demonstrate two SAT modeling techniques that can provide many orders of magnitude speedup in discovering the correct key. Specifically, we consider relaxed encodings and symmetry breaking. To demonstrate their impact, we model and attack a state-of-the-art logic locking technique, Full-Lock. We show that circuits previously unbreakable within 15 days of run time can be solved in seconds. Consequently, in assessing the strength of any given locking, it is imperative that these modeling techniques be considered. To remedy this vulnerability in the considered locking technique, we demonstrate an extended version, logic-enhanced Banyan locking, that is resistant to our proposed modeling techniques.

2.5 | CR | Jan 15, 2017

Static Detection of DoS Vulnerabilities in Programs that use Regular Expressions (Extended Version)

Valentin Wüstholz, Oswaldo Olivo, Marijn J. H. Heule et al.

In an algorithmic complexity attack, a malicious party takes advantage of the worst-case behavior of an algorithm to cause denial-of-service. A prominent algorithmic complexity attack is regular expression denial-of-service (ReDoS), in which the attacker exploits a vulnerable regular expression by providing a carefully crafted input string that triggers worst-case behavior of the matching algorithm. This paper proposes a technique for automatically finding ReDoS vulnerabilities in programs. Specifically, our approach automatically identifies vulnerable regular expressions in the program and determines whether an "evil" input string can be matched against a vulnerable regular expression. We have implemented our proposed approach in a tool called REXPLOITER and found 41 exploitable security vulnerabilities in Java web applications.
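The kind of input involved can be shown with a small experiment. The sketch below uses Python's backtracking `re` module as a stand-in (the paper targets Java programs), with a textbook vulnerable pattern chosen for illustration; it demonstrates the attack itself, not the static detection technique.

```python
import re
import time

# Nested quantifiers: a classic pattern that backtracking engines handle badly.
VULNERABLE = re.compile(r"^(a+)+$")

def time_match(n):
    """Match an 'evil' string of n 'a's followed by a character that forces failure."""
    evil = "a" * n + "!"
    start = time.perf_counter()
    result = VULNERABLE.match(evil)
    return result, time.perf_counter() - start

for n in (10, 14, 18):
    result, seconds = time_match(n)
    print(n, result, f"{seconds:.4f}s")
```

The match fails for every `n`, but a backtracking engine typically explores a number of paths that grows roughly exponentially with `n` before giving up, so keep `n` small when experimenting.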

1.2 | DS | Feb 18, 2014

Concurrent Cube-and-Conquer

Peter van der Tak, Marijn J. H. Heule, Armin Biere

Recent work introduced the cube-and-conquer technique to solve hard SAT instances. It partitions the search space into cubes using a lookahead solver. Each cube is tackled by a conflict-driven clause learning (CDCL) solver. Crucial for strong performance is the cutoff heuristic that decides when to switch from lookahead to CDCL. Yet, this offline heuristic is far from ideal. In this paper, we present a novel hybrid solver that applies the cube and conquer steps simultaneously. A lookahead and a CDCL solver work together on each cube, while communication is restricted to synchronization. Our concurrent cube-and-conquer solver can solve many instances faster than pure lookahead, pure CDCL and offline cube-and-conquer, and can abort early in favor of a pure CDCL search if an instance is not suitable for cube-and-conquer techniques.
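The basic (offline) cube-and-conquer scheme can be sketched in a few lines. This toy version is an assumption-laden stand-in: the split variables are given by hand rather than chosen by a lookahead solver, a naive DPLL procedure replaces CDCL, and the concurrent variant described in the abstract is not modeled.

```python
from itertools import product

def simplify(clauses, lit):
    """Assign lit = True. Returns the reduced clause list, or None on conflict."""
    out = []
    for cl in clauses:
        if lit in cl:
            continue                       # clause satisfied, drop it
        reduced = tuple(l for l in cl if l != -lit)
        if not reduced:
            return None                    # empty clause: conflict
        out.append(reduced)
    return out

def dpll(clauses):
    """Naive complete search standing in for the 'conquer' solver."""
    if clauses is None:
        return False
    if not clauses:
        return True
    lit = clauses[0][0]
    return dpll(simplify(clauses, lit)) or dpll(simplify(clauses, -lit))

def cube_and_conquer(clauses, split_vars):
    """Enumerate all sign patterns over split_vars (the cubes); solve each one."""
    for signs in product([1, -1], repeat=len(split_vars)):
        reduced = clauses
        for lit in (s * v for s, v in zip(signs, split_vars)):
            reduced = simplify(reduced, lit)
            if reduced is None:
                break                      # cube refuted during propagation
        if dpll(reduced):
            return True                    # some cube is satisfiable
    return False                           # every cube is unsatisfiable

sat_formula = [(1, 2), (-1, 2), (1, -2)]
unsat_formula = sat_formula + [(-1, -2)]
print(cube_and_conquer(sat_formula, [1]), cube_and_conquer(unsat_formula, [1, 2]))
```

Because the cubes partition the search space, each one can be handed to a separate solver, which is what makes the approach attractive for parallel solving.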

2.3 | DS | Feb 18, 2014

Symbiosis of Search and Heuristics for Random 3-SAT

Sid Mijnders, Boris de Wilde, Marijn Heule

When combined properly, search techniques can reveal the full potential of sophisticated branching heuristics. We demonstrate this observation on the well-known class of random 3-SAT formulae. First, a new branching heuristic is presented, which generalizes existing work on this class. Much smaller search trees can be constructed by using this heuristic. Second, we introduce a variant of discrepancy search, called ALDS. Theoretical and practical evidence support that ALDS traverses the search tree in a near-optimal order when combined with the new heuristic. Both techniques, search and heuristic, have been implemented in the look-ahead solver march. The SAT 2009 competition results show that march is by far the strongest complete solver on random k-SAT formulae.

9.1 | AI | Feb 18, 2014

Towards Ultra Rapid Restarts

Shai Haim, Marijn Heule

We observe a trend regarding restart strategies used in SAT solvers. A few years ago, most state-of-the-art solvers restarted on average after a few thousand backtracks. Currently, restarting after a dozen backtracks results in much better performance. The main reason for this trend is that heuristics and data structures have become more restart-friendly. We expect this trend to continue, so future SAT solvers will restart even more rapidly. Additionally, we present experimental results to support our observations.