Bernardo Subercaseaux

AI
h-index61
14papers
277citations
Novelty54%
AI Score54

14 Papers

LGJun 30, 2022
On Computing Probabilistic Explanations for Decision Trees

Marcelo Arenas, Pablo Barceló, Miguel Romero et al.

Formal XAI (explainable AI) is a growing area that focuses on computing explanations with mathematical guarantees for the decisions made by ML models. Inside formal XAI, one of the most studied cases is that of explaining the choices taken by decision trees, as they are traditionally deemed as one of the most interpretable classes of models. Recent work has focused on studying the computation of "sufficient reasons", a kind of explanation in which given a decision tree $T$ and an instance $x$, one explains the decision $T(x)$ by providing a subset $y$ of the features of $x$ such that for any other instance $z$ compatible with $y$, it holds that $T(z) = T(x)$, intuitively meaning that the features in $y$ are already enough to fully justify the classification of $x$ by $T$. It has been argued, however, that sufficient reasons constitute a restrictive notion of explanation, and thus the community has started to study their probabilistic counterpart, in which one requires that the probability of $T(z) = T(x)$ must be at least some value $δ\in (0, 1]$, where $z$ is a random instance that is compatible with $y$. Our paper settles the computational complexity of $δ$-sufficient-reasons over decision trees, showing that both (1) finding $δ$-sufficient-reasons that are minimal in size, and (2) finding $δ$-sufficient-reasons that are minimal inclusion-wise, do not admit polynomial-time algorithms (unless P=NP). This is in stark contrast with the deterministic case ($δ= 1$) where inclusion-wise minimal sufficient-reasons are easy to compute. By doing this, we answer two open problems originally raised by Izza et al. On the positive side, we identify structural restrictions of decision trees that make the problem tractable, and show how SAT solvers might be able to tackle these problems in practical settings.

CLFeb 3
Accelerating Scientific Research with Gemini: Case Studies and Common Techniques

David P. Woodruff, Vincent Cohen-Addad, Lalit Jain et al.

Recent advances in large language models (LLMs) have opened new avenues for accelerating scientific research. While models are increasingly capable of assisting with routine tasks, their ability to contribute to novel, expert-level mathematical discovery is less understood. We present a collection of case studies demonstrating how researchers have successfully collaborated with advanced AI models, specifically Google's Gemini-based models (in particular Gemini Deep Think and its advanced variants), to solve open problems, refute conjectures, and generate new proofs across diverse areas in theoretical computer science, as well as other areas such as economics, optimization, and physics. Based on these experiences, we extract common techniques for effective human-AI collaboration in theoretical research, such as iterative refinement, problem decomposition, and cross-disciplinary knowledge transfer. While the majority of our results stem from this interactive, conversational methodology, we also highlight specific instances that push beyond standard chat interfaces. These include deploying the model as a rigorous adversarial reviewer to detect subtle flaws in existing proofs, and embedding it within a "neuro-symbolic" loop that autonomously writes and executes code to verify complex derivations. Together, these examples highlight the potential of AI not just as a tool for automation, but as a versatile, genuine partner in the creative process of scientific discovery.

DMJan 23, 2023
The Packing Chromatic Number of the Infinite Square Grid is 15

Bernardo Subercaseaux, Marijn J. H. Heule

A packing $k$-coloring is a natural variation on the standard notion of graph $k$-coloring, where vertices are assigned numbers from $\{1, \ldots, k\}$, and any two vertices assigned a common color $c \in \{1, \ldots, k\}$ need to be at a distance greater than $c$ (as opposed to $1$, in standard graph colorings). Despite a sequence of incremental work, determining the packing chromatic number of the infinite square grid has remained an open problem since its introduction in 2002. We culminate the search by proving this number to be 15. We achieve this result by improving the best-known method for this problem by roughly two orders of magnitude. The most important technique to boost performance is a novel and surprisingly effective propositional encoding. Additionally, we developed a new symmetry-breaking method. Since both new techniques are more complex than existing techniques for this problem, a verified approach is required to trust them. We include both techniques in a proof of unsatisfiability, reducing the trusted core to the correctness of the direct encoding.

97.4COApr 23
Doubly Saturated Ramsey Graphs: A Case Study in Computer-Assisted Mathematical Discovery

Benjamin Przybocki, John Mackey, Marijn J. H. Heule et al.

Ramsey-good graphs are graphs that contain neither a clique of size $s$ nor an independent set of size $t$. We study doubly saturated Ramsey-good graphs, defined as Ramsey-good graphs in which the addition or removal of any edge necessarily creates an $s$-clique or a $t$-independent set. We present a method combining SAT solving with bespoke LLM-generated code to discover infinite families of such graphs, answering a question of Grinstead and Roberts from 1982. In addition, we use LLMs to generate and formalize correctness proofs in Lean. This case study highlights the potential of integrating automated reasoning, large language models, and formal verification to accelerate mathematical discovery. We argue that such tool-driven workflows will play an increasingly central role in experimental mathematics.

LOOct 18, 2023
A Uniform Language to Explain Decision Trees

Marcelo Arenas, Pablo Barcelo, Diego Bustamante et al.

The formal XAI community has studied a plethora of interpretability queries aiming to understand the classifications made by decision trees. However, a more uniform understanding of what questions we can hope to answer about these models, traditionally deemed to be easily interpretable, has remained elusive. In an initial attempt to understand uniform languages for interpretability, Arenas et al. (2021) proposed FOIL, a logic for explaining black-box ML models, and showed that it can express a variety of interpretability queries. However, we show that FOIL is limited in two important senses: (i) it is not expressive enough to capture some crucial queries, and (ii) its model agnostic nature results in a high computational complexity for decision trees. In this paper, we carefully craft two fragments of first-order logic that allow for efficiently interpreting decision trees: Q-DT-FOIL and its optimization variant OPT-DT-FOIL. We show that our proposed logics can express not only a variety of interpretability queries considered by previous literature, but also elegantly allows users to specify different objectives the sought explanations should optimize for. Using finite model-theoretic techniques, we show that the different ingredients of Q-DT-FOIL are necessary for its expressiveness, and yet that queries in Q-DT-FOIL can be evaluated with a polynomial number of queries to a SAT solver, as well as their optimization versions in OPT-DT-FOIL. Besides our theoretical results, we provide a SAT-based implementation of the evaluation for OPT-DT-FOIL that is performant on industry-size decision trees.

63.3CCMar 29
Automated Reencoding Meets Graph Theory

Benjamin Przybocki, Bernardo Subercaseaux, Marijn J. H. Heule

Bounded Variable Addition (BVA) is a central preprocessing method in modern state-of-the-art SAT solvers. We provide a graph-theoretic characterization of which 2-CNF encodings can be constructed by an idealized BVA algorithm. Based on this insight, we prove new results about the behavior and limitations of BVA and its interaction with other preprocessing techniques. We show that idealized BVA, plus some minor additional preprocessing (e.g., equivalent literal substitution), can reencode any 2-CNF formula with $n$ variables into an equivalent 2-CNF formula with $(\tfrac{\lg(3)}{4}+o(1))\,\tfrac{n^2}{\lg n}$ clauses. Furthermore, we show that without the additional preprocessing the constant factor worsens from $\tfrac{\lg(3)}{4} \approx 0.396$ to $1$, and that no reencoding method can achieve a constant below $0.25$. On the other hand, for the at-most-one constraint on $n$ variables, we prove that idealized BVA cannot reencode this constraint using fewer than $3n-6$ clauses, a bound that we prove is achieved by actual implementations. In particular, this shows that the product encoding for at-most-one, which uses $2n+o(n)$ clauses, cannot be constructed by BVA regardless of the heuristics used. Finally, our graph-theoretic characterization of BVA allows us to leverage recent work in algorithmic graph theory to develop a drastically more efficient implementation of BVA that achieves a comparable clause reduction on random monotone 2-CNF formulas.

11.2DMMay 2
Price of Locality in Permutation Mastermind: Are TikTok influencers Chaotic Enough?

Bernardo Subercaseaux

In the permutation Mastermind game, the goal is to uncover a secret permutation $σ^\star \colon [n] \to [n]$ by making a series of guesses $π_1, \ldots, π_T$ which must also be permutations of $[n]$, and receiving as feedback after guess $π_t$ the number of positions $i$ for which $σ^\star(i) = π_t(i)$. While the existing literature on permutation Mastermind suggests strategies in which $π_t$ and $π_{t+1}$ might be widely different permutations, a resurgence in popularity of this game as a TikTok trend shows that humans (or at least TikTok influencers) use strategies in which consecutive guesses are very similar. For example, it is common to see players attempt one transposition at a time and slowly see their score increase. Motivated by these observations, we study the theoretical impact of two forms of "locality" in permutation Mastermind strategies: $\ell_k$-local strategies, in which any two consecutive guesses differ in at most $k$ positions, and the even more restrictive class of $w_k$-local strategies, in which consecutive guesses differ in a window of length at most $k$. We show that, in broad terms, the optimal number of guesses for local strategies is quadratic, and thus much worse than the $O(n \lg n)$ guesses that suffice for non-local strategies. We also show NP-hardness of the satisfiability version for $\ell_3$-local strategies, whereas in the $\ell_2$-local variant the problem admits a randomized polynomial algorithm.

92.6CCMar 30
Near-Optimal Encodings of Cardinality Constraints

Andrew Krapivin, Benjamin Przybocki, Bernardo Subercaseaux

We present several novel encodings for cardinality constraints, which use fewer clauses than previous encodings and, more importantly, introduce new generally applicable techniques for constructing compact encodings. First, we present a CNF encoding for the $\text{AtMostOne}(x_1,\dots,x_n)$ constraint using $2n + 2 \sqrt{2n} + O(\sqrt[3]{n})$ clauses, thus refuting the conjectured optimality of Chen's product encoding. Our construction also yields a smaller monotone circuit for the threshold-2 function, improving on a 50-year-old construction of Adleman and incidentally solving a long-standing open problem in circuit complexity. On the other hand, we show that any encoding for this constraint requires at least $2n + \sqrt{n+1} - 2$ clauses, which is the first nontrivial unconditional lower bound for this constraint and answers a question of Kučera, Savický, and Vorel. We then turn our attention to encodings of $\text{AtMost}_k(x_1,\dots,x_n)$, where we introduce "grid compression", a technique inspired by hash tables, to give encodings using $2n + o(n)$ clauses as long as $k = o(\sqrt[3]{n})$ and $4n + o(n)$ clauses as long as $k = o(n)$. Previously, the smallest known encodings were of size $(k+1)n + o(n)$ for $k \le 5$ and $7n - o(n)$ for $k \ge 6$.

LGJan 10, 2025
Explaining k-Nearest Neighbors: Abductive and Counterfactual Explanations

Pablo Barceló, Alexander Kozachinskiy, Miguel Romero Orth et al.

Despite the wide use of $k$-Nearest Neighbors as classification models, their explainability properties remain poorly understood from a theoretical perspective. While nearest neighbors classifiers offer interpretability from a "data perspective", in which the classification of an input vector $\bar{x}$ is explained by identifying the vectors $\bar{v}_1, \ldots, \bar{v}_k$ in the training set that determine the classification of $\bar{x}$, we argue that such explanations can be impractical in high-dimensional applications, where each vector has hundreds or thousands of features and it is not clear what their relative importance is. Hence, we focus on understanding nearest neighbor classifications through a "feature perspective", in which the goal is to identify how the values of the features in $\bar{x}$ affect its classification. Concretely, we study abductive explanations such as "minimum sufficient reasons", which correspond to sets of features in $\bar{x}$ that are enough to guarantee its classification, and "counterfactual explanations" based on the minimum distance feature changes one would have to perform in $\bar{x}$ to change its classification. We present a detailed landscape of positive and negative complexity results for counterfactual and abductive explanations, distinguishing between discrete and continuous feature spaces, and considering the impact of the choice of distance function involved. Finally, we show that despite some negative complexity results, Integer Quadratic Programming and SAT solving allow for computing explanations in practice.

LOJun 16, 2025
Asymptotically Smaller Encodings for Graph Problems and Scheduling

Bernardo Subercaseaux

We show how several graph problems (e.g., vertex-cover, independent-set, $k$-coloring) can be encoded into CNF using only $O(|V|^2 / \lg |V|)$ many clauses, as opposed to the $Ω(|V|^2)$ constraints used by standard encodings. This somewhat surprising result is a simple consequence of a result of Erdős, Chung, and Spencer (1983) about biclique coverings of graphs, and opens theoretical avenues to understand the success of "Bounded Variable Addition'' (Manthey, Heule, and Biere, 2012) as a preprocessing tool. Finally, we show a novel encoding for independent sets in some dense interval graphs using only $O(|V| \lg |V|)$ clauses (the direct encoding uses $Ω(|V|^2)$), which we have successfully applied to a string-compression encoding posed by Bannai et al. (2022). As a direct byproduct, we obtain a reduction in the encoding size of a scheduling problem posed by Mayank and Modal (2020) from $O(NMT^2)$ to $O(NMT + M T^2 \lg T)$, where $N$ is the number of tasks, $T$ the total timespan, and $M$ the number of machines.

CGJun 1, 2025
Unfolding Boxes with Local Constraints

Long Qian, Eric Wang, Bernardo Subercaseaux et al.

We consider the problem of finding and enumerating polyominos that can be folded into multiple non-isomorphic boxes. While several computational approaches have been proposed, including SAT, randomized algorithms, and decision diagrams, none has been able to perform at scale. We argue that existing SAT encodings are hindered by the presence of global constraints (e.g., graph connectivity or acyclicity), which are generally hard to encode effectively and hard for solvers to reason about. In this work, we propose a new SAT-based approach that replaces these global constraints with simple local constraints that have substantially better propagation properties. Our approach dramatically improves the scalability of both computing and enumerating common box unfoldings: (i) while previous approaches could only find common unfoldings of two boxes up to area 88, ours easily scales beyond 150, and (ii) while previous approaches were only able to enumerate common unfoldings up to area 30, ours scales up to 60. This allows us to rule out 46, 54, and 58 as the smallest areas allowing a common unfolding of three boxes, thereby refuting a conjecture of Xu et al. (2017).

AIDec 30, 2024
Probabilistic Explanations for Linear Models

Bernardo Subercaseaux, Marcelo Arenas, Kuldeep S Meel

Formal XAI is an emerging field that focuses on providing explanations with mathematical guarantees for the decisions made by machine learning models. A significant amount of work in this area is centered on the computation of "sufficient reasons". Given a model $M$ and an input instance $\vec{x}$, a sufficient reason for the decision $M(\vec{x})$ is a subset $S$ of the features of $\vec{x}$ such that for any instance $\vec{z}$ that has the same values as $\vec{x}$ for every feature in $S$, it holds that $M(\vec{x}) = M(\vec{z})$. Intuitively, this means that the features in $S$ are sufficient to fully justify the classification of $\vec{x}$ by $M$. For sufficient reasons to be useful in practice, they should be as small as possible, and a natural way to reduce the size of sufficient reasons is to consider a probabilistic relaxation; the probability of $M(\vec{x}) = M(\vec{z})$ must be at least some value $δ\in (0,1]$, for a random instance $\vec{z}$ that coincides with $\vec{x}$ on the features in $S$. Computing small $δ$-sufficient reasons ($δ$-SRs) is known to be a theoretically hard problem; even over decision trees--traditionally deemed simple and interpretable models--strong inapproximability results make the efficient computation of small $δ$-SRs unlikely. We propose the notion of $(δ, ε)$-SR, a simple relaxation of $δ$-SRs, and show that this kind of explanation can be computed efficiently over linear models.

AIOct 5, 2021
Foundations of Symbolic Languages for Model Interpretability

Marcelo Arenas, Daniel Baez, Pablo Barceló et al.

Several queries and scores have recently been proposed to explain individual predictions over ML models. Given the need for flexible, reliable, and easy-to-apply interpretability methods for ML models, we foresee the need for developing declarative languages to naturally specify different explainability queries. We do this in a principled way by rooting such a language in a logic, called FOIL, that allows for expressing many simple but important explainability queries, and might serve as a core for more expressive interpretability languages. We study the computational complexity of FOIL queries over two classes of ML models often deemed to be easily interpretable: decision trees and OBDDs. Since the number of possible inputs for an ML model is exponential in its dimension, the tractability of the FOIL evaluation problem is delicate but can be achieved by either restricting the structure of the models or the fragment of FOIL being evaluated. We also present a prototype implementation of FOIL wrapped in a high-level declarative language and perform experiments showing that such a language can be used in practice.

AIOct 23, 2020
Model Interpretability through the Lens of Computational Complexity

Pablo Barceló, Mikaël Monet, Jorge Pérez et al.

In spite of several claims stating that some models are more interpretable than others -- e.g., "linear models are more interpretable than deep neural networks" -- we still lack a principled notion of interpretability to formally compare among different classes of models. We make a step towards such a notion by studying whether folklore interpretability claims have a correlate in terms of computational complexity theory. We focus on local post-hoc explainability queries that, intuitively, attempt to answer why individual inputs are classified in a certain way by a given model. In a nutshell, we say that a class $\mathcal{C}_1$ of models is more interpretable than another class $\mathcal{C}_2$, if the computational complexity of answering post-hoc queries for models in $\mathcal{C}_2$ is higher than for those in $\mathcal{C}_1$. We prove that this notion provides a good theoretical counterpart to current beliefs on the interpretability of models; in particular, we show that under our definition and assuming standard complexity-theoretical assumptions (such as P$\neq$NP), both linear and tree-based models are strictly more interpretable than neural networks. Our complexity analysis, however, does not provide a clear-cut difference between linear and tree-based models, as we obtain different results depending on the particular post-hoc explanations considered. Finally, by applying a finer complexity analysis based on parameterized complexity, we are able to prove a theoretical result suggesting that shallow neural networks are more interpretable than deeper ones.