Peter J. Stuckey

AI
h-index44
38papers
1,066citations
Novelty51%
AI Score53

38 Papers

AIJun 20, 2022
Eliminating The Impossible, Whatever Remains Must Be True

Jinqiang Yu, Alexey Ignatiev, Peter J. Stuckey et al.

The rise of AI methods to make predictions and decisions has led to a pressing need for more explainable artificial intelligence (XAI) methods. One common approach for XAI is to produce a post-hoc explanation, explaining why a black box ML model made a certain prediction. Formal approaches to post-hoc explanations provide succinct reasons for why a prediction was made, as well as why not another prediction was made. But these approaches assume that features are independent and uniformly distributed. While this means that "why" explanations are correct, they may be longer than required. It also means the "why not" explanations may be suspect as the counterexamples they rely on may not be meaningful. In this paper, we show how one can apply background knowledge to give more succinct "why" formal explanations, that are presumably easier to interpret by humans, and give more accurate "why not" explanations. In addition, we show how to use existing rule induction techniques to efficiently extract background information from a dataset, and also how to report which background information was used to make an explanation, allowing a human to examine it if they doubt the correctness of the explanation.

AIJul 7, 2023
On Formal Feature Attribution and Its Approximation

Jinqiang Yu, Alexey Ignatiev, Peter J. Stuckey

Recent years have witnessed the widespread use of artificial intelligence (AI) algorithms and machine learning (ML) models. Despite their tremendous success, a number of vital problems like ML model brittleness, their fairness, and the lack of interpretability warrant the need for the active developments in explainable artificial intelligence (XAI) and formal ML model verification. The two major lines of work in XAI include feature selection methods, e.g. Anchors, and feature attribution techniques, e.g. LIME and SHAP. Despite their promise, most of the existing feature selection and attribution approaches are susceptible to a range of critical issues, including explanation unsoundness and out-of-distribution sampling. A recent formal approach to XAI (FXAI) although serving as an alternative to the above and free of these issues suffers from a few other limitations. For instance and besides the scalability limitation, the formal approach is unable to tackle the feature attribution problem. Additionally, a formal explanation despite being formally sound is typically quite large, which hampers its applicability in practical settings. Motivated by the above, this paper proposes a way to apply the apparatus of formal XAI to the case of feature attribution based on formal explanation enumeration. Formal feature attribution (FFA) is argued to be advantageous over the existing methods, both formal and non-formal. Given the practical complexity of the problem, the paper then proposes an efficient technique for approximating exact FFA. Finally, it offers experimental evidence of the effectiveness of the proposed approximate FFA in comparison to the existing feature attribution algorithms not only in terms of feature importance and but also in terms of their relative order.

AIJan 27Code
MATA: A Trainable Hierarchical Automaton System for Multi-Agent Visual Reasoning

Zhixi Cai, Fucai Ke, Kevin Leo et al.

Recent vision-language models have strong perceptual ability but their implicit reasoning is hard to explain and easily generates hallucinations on complex queries. Compositional methods improve interpretability, but most rely on a single agent or hand-crafted pipeline and cannot decide when to collaborate across complementary agents or compete among overlapping ones. We introduce MATA (Multi-Agent hierarchical Trainable Automaton), a multi-agent system presented as a hierarchical finite-state automaton for visual reasoning whose top-level transitions are chosen by a trainable hyper agent. Each agent corresponds to a state in the hyper automaton, and runs a small rule-based sub-automaton for reliable micro-control. All agents read and write a shared memory, yielding transparent execution history. To supervise the hyper agent's transition policy, we build transition-trajectory trees and transform to memory-to-next-state pairs, forming the MATA-SFT-90K dataset for supervised finetuning (SFT). The finetuned LLM as the transition policy understands the query and the capacity of agents, and it can efficiently choose the optimal agent to solve the task. Across multiple visual reasoning benchmarks, MATA achieves the state-of-the-art results compared with monolithic and compositional baselines. The code and dataset are available at https://github.com/ControlNet/MATA.

AIAug 22, 2023
Traffic Flow Optimisation for Lifelong Multi-Agent Path Finding

Zhe Chen, Daniel Harabor, Jiaoyang Li et al.

Multi-Agent Path Finding (MAPF) is a fundamental problem in robotics that asks us to compute collision-free paths for a team of agents, all moving across a shared map. Although many works appear on this topic, all current algorithms struggle as the number of agents grows. The principal reason is that existing approaches typically plan free-flow optimal paths, which creates congestion. To tackle this issue, we propose a new approach for MAPF where agents are guided to their destination by following congestion-avoiding paths. We evaluate the idea in two large-scale settings: one-shot MAPF, where each agent has a single destination, and lifelong MAPF, where agents are continuously assigned new destinations. Empirically, we report large improvements in solution quality for one-short MAPF and in overall throughput for lifelong MAPF.

25.1AIMar 31
Rigorous Explanations for Tree Ensembles

Yacine Izza, Alexey Ignatiev, Xuanxiang Huang et al.

Tree ensembles (TEs) find a multitude of practical applications. They represent one of the most general and accurate classes of machine learning methods. While they are typically quite concise in representation, their operation remains inscrutable to human decision makers. One solution to build trust in the operation of TEs is to automatically identify explanations for the predictions made. Evidently, we can only achieve trust using explanations, if those explanations are rigorous, that is truly reflect properties of the underlying predictor they explain This paper investigates the computation of rigorously-defined, logically-sound explanations for the concrete case of two well-known examples of tree ensembles, namely random forests and boosted trees.

QMAug 11, 2023
The divergence time of protein structures modelled by Markov matrices and its relation to the divergence of sequences

Sandun Rajapaksa, Lloyd Allison, Peter J. Stuckey et al.

A complete time-parameterized statistical model quantifying the divergent evolution of protein structures in terms of the patterns of conservation of their secondary structures is inferred from a large collection of protein 3D structure alignments. This provides a better alternative to time-parameterized sequence-based models of protein relatedness, that have clear limitations dealing with twilight and midnight zones of sequence relationships. Since protein structures are far more conserved due to the selection pressure directly placed on their function, divergence time estimates can be more accurate when inferred from structures. We use the Bayesian and information-theoretic framework of Minimum Message Length to infer a time-parameterized stochastic matrix (accounting for perturbed structural states of related residues) and associated Dirichlet models (accounting for insertions and deletions during the evolution of protein domains). These are used in concert to estimate the Markov time of divergence of tertiary structures, a task previously only possible using proxies (like RMSD). By analyzing one million pairs of homologous structures, we yield a relationship between the Markov divergence time of structures and of sequences. Using these inferred models and the relationship between the divergence of sequences and structures, we demonstrate a competitive performance in secondary structure prediction against neural network architectures commonly employed for this task. The source code and supplementary information are downloadable from \url{http://lcb.infotech.monash.edu.au/sstsum}.

AIJul 17, 2023
Lifted Sequential Planning with Lazy Constraint Generation Solvers

Anubhav Singh, Miquel Ramirez, Nir Lipovetzky et al.

This paper studies the possibilities made open by the use of Lazy Clause Generation (LCG) based approaches to Constraint Programming (CP) for tackling sequential classical planning. We propose a novel CP model based on seminal ideas on so-called lifted causal encodings for planning as satisfiability, that does not require grounding, as choosing groundings for functions and action schemas becomes an integral part of the problem of designing valid plans. This encoding does not require encoding frame axioms, and does not explicitly represent states as decision variables for every plan step. We also present a propagator procedure that illustrates the possibilities of LCG to widen the kind of inference methods considered to be feasible in planning as (iterated) CSP solving. We test encodings and propagators over classic IPC and recently proposed benchmarks for lifted planning, and report that for planning problem instances requiring fewer plan steps our methods compare very well with the state-of-the-art in optimal sequential planning.

ROJun 28, 2023
Reducing Redundant Work in Jump Point Search

Shizhe Zhao, Daniel Harabor, Peter J. Stuckey

JPS (Jump Point Search) is a state-of-the-art optimal algorithm for online grid-based pathfinding. Widely used in games and other navigation scenarios, JPS nevertheless can exhibit pathological behaviours which are not well studied: (i) it may repeatedly scan the same area of the map to find successors; (ii) it may generate and expand suboptimal search nodes. In this work, we examine the source of these pathological behaviours, show how they can occur in practice, and propose a purely online approach, called Constrained JPS (CJPS), to tackle them efficiently. Experimental results show that CJPS has low overheads and is often faster than JPS in dynamically changing grid environments: by up to 7x in large game maps and up to 14x in pathological scenarios.

CYMar 19, 2025
3+ Seat Risk-Limiting Audits for Single Transferable Vote Elections

Michelle Blom, Alexander Ek, Peter J. Stuckey et al.

Constructing efficient risk-limiting audits (RLAs) for multiwinner single transferable vote (STV) elections is a challenging problem. An STV RLA is designed to statistically verify that the reported winners of an election did indeed win according to the voters' expressed preferences and not due to mistabulation or interference, while limiting the risk of accepting an incorrect outcome to a desired threshold (the risk limit). Existing methods have shown that it is possible to form RLAs for two-seat STV elections in the context where the first seat has been awarded to a candidate in the first round of tabulation. This is called the first winner criterion. We present an assertion-based approach to conducting full or partial RLAs for STV elections with three or more seats, in which the first winner criterion is satisfied. Although the chance of forming a full audit that verifies all winners drops substantially as the number of seats increases, we show that we can quite often form partial audits that verify most, and sometimes all, of the reported winners. We evaluate our method on a dataset of over 500 three- and four-seat STV elections from the 2017 and 2022 local council elections in Scotland.

CVFeb 1, 2025Code
NAVER: A Neuro-Symbolic Compositional Automaton for Visual Grounding with Explicit Logic Reasoning

Zhixi Cai, Fucai Ke, Simindokht Jahangard et al.

Visual Grounding (VG) tasks, such as referring expression detection and segmentation tasks are important for linking visual entities to context, especially in complex reasoning tasks that require detailed query interpretation. This paper explores VG beyond basic perception, highlighting challenges for methods that require reasoning like human cognition. Recent advances in large language methods (LLMs) and Vision-Language methods (VLMs) have improved abilities for visual comprehension, contextual understanding, and reasoning. These methods are mainly split into end-to-end and compositional methods, with the latter offering more flexibility. Compositional approaches that integrate LLMs and foundation models show promising performance but still struggle with complex reasoning with language-based logical representations. To address these limitations, we propose NAVER, a compositional visual grounding method that integrates explicit probabilistic logic reasoning within a finite-state automaton, equipped with a self-correcting mechanism. This design improves robustness and interpretability in inference through explicit logic reasoning. Our results show that NAVER achieves SoTA performance comparing to recent end-to-end and compositional baselines. The code is available at https://github.com/ControlNet/NAVER .

11.3CYMar 26
Doing More With Less: Mismatch-Based Risk-Limiting Audits

Alexander Ek, Michelle Blom, Philip B. Stark et al.

One approach to risk-limiting audits (RLAs) compares randomly selected cast vote records (CVRs) to votes read by human auditors from the corresponding ballot cards. Historically, such methods reduce audit sample sizes by considering how each sampled CVR differs from the corresponding true vote, not merely whether they differ. Here we investigate the latter approach, auditing by testing whether the total number of mismatches in the full set of CVRs exceeds the minimum number of CVR errors required for the reported outcome to be wrong (the "CVR margin"). This strategy makes it possible to audit more social choice functions and simplifies RLAs conceptually, which makes it easier to explain than some other RLA approaches. The cost is larger sample sizes. "Mismatch-based RLAs" only require a lower bound on the CVR margin, which for some social choice functions is easier to calculate than the effect of particular errors. When the population rate of mismatches is low and the lower bound on the CVR margin is close to the true CVR margin, the increase in sample size is small. However, the increase may be very large when errors include errors that, if corrected, would widen the CVR margin rather than narrow it; errors affect the margin between candidates other than the reported winner with the fewest votes and the reported loser with the most votes; or errors that affect different margins.

FLFeb 12
A Formal Framework for the Explanation of Finite Automata Decisions

Jaime Cuartas Granada, Alexey Ignatiev, Peter J. Stuckey

Finite automata (FA) are a fundamental computational abstraction that is widely used in practice for various tasks in computer science, linguistics, biology, electrical engineering, and artificial intelligence. Given an input word, an FA maps the word to a result, in the simple case "accept" or "reject", but in general to one of a finite set of results. A question that then arises is: why? Another question is: how can we modify the input word so that it is no longer accepted? One may think that the automaton itself is an adequate explanation of its behaviour, but automata can be very complex and difficult to make sense of directly. In this work, we investigate how to explain the behaviour of an FA on an input word in terms of the word's characters. In particular, we are interested in minimal explanations: what is the minimal set of input characters that explains the result, and what are the minimal changes needed to alter the result? In this paper, we propose an efficient method to determine all minimal explanations for the behaviour of an FA on a particular word. This allows us to give unbiased explanations about which input features are responsible for the result. Experiments show that our approach scales well, even when the underlying problem is challenging.

CYApr 1, 2020Code
You can do RLAs for IRV

Michelle Blom, Andrew Conway, Dan King et al.

The City and County of San Francisco, CA, has used Instant Runoff Voting (IRV) for some elections since 2004. This report describes the first ever process pilot of Risk Limiting Audits for IRV, for the San Francisco District Attorney's race in November, 2019. We found that the vote-by-mail outcome could be efficiently audited to well under the 0.05 risk limit given a sample of only 200 ballots. All the software we developed for the pilot is open source.

AIDec 12, 2023
Anytime Approximate Formal Feature Attribution

Jinqiang Yu, Graham Farr, Alexey Ignatiev et al.

Widespread use of artificial intelligence (AI) algorithms and machine learning (ML) models on the one hand and a number of crucial issues pertaining to them warrant the need for explainable artificial intelligence (XAI). A key explainability question is: given this decision was made, what are the input features which contributed to the decision? Although a range of XAI approaches exist to tackle this problem, most of them have significant limitations. Heuristic XAI approaches suffer from the lack of quality guarantees, and often try to approximate Shapley values, which is not the same as explaining which features contribute to a decision. A recent alternative is so-called formal feature attribution (FFA), which defines feature importance as the fraction of formal abductive explanations (AXp's) containing the given feature. This measures feature importance from the view of formally reasoning about the model's behavior. It is challenging to compute FFA using its definition because that involves counting AXp's, although one can approximate it. Based on these results, this paper makes several contributions. First, it gives compelling evidence that computing FFA is intractable, even if the set of contrastive formal explanations (CXp's) is provided, by proving that the problem is #P-hard. Second, by using the duality between AXp's and CXp's, it proposes an efficient heuristic to switch from CXp enumeration to AXp enumeration on-the-fly resulting in an adaptive explanation enumeration algorithm effectively approximating FFA in an anytime fashion. Finally, experimental results obtained on a range of widely used datasets demonstrate the effectiveness of the proposed FFA approximation approach in terms of the error of FFA approximation as well as the number of explanations computed and their diversity given a fixed time limit.

AIOct 18, 2024
Formal Explanations for Neuro-Symbolic AI

Sushmita Paul, Jinqiang Yu, Jip J. Dekker et al.

Despite the practical success of Artificial Intelligence (AI), current neural AI algorithms face two significant issues. First, the decisions made by neural architectures are often prone to bias and brittleness. Second, when a chain of reasoning is required, neural systems often perform poorly. Neuro-symbolic artificial intelligence is a promising approach that tackles these (and other) weaknesses by combining the power of neural perception and symbolic reasoning. Meanwhile, the success of AI has made it critical to understand its behaviour, leading to the development of explainable artificial intelligence (XAI). While neuro-symbolic AI systems have important advantages over purely neural AI, we still need to explain their actions, which are obscured by the interactions of the neural and symbolic components. To address the issue, this paper proposes a formal approach to explaining the decisions of neuro-symbolic systems. The approach hinges on the use of formal abductive explanations and on solving the neuro-symbolic explainability problem hierarchically. Namely, it first computes a formal explanation for the symbolic component of the system, which serves to identify a subset of the individual parts of neural information that needs to be explained. This is followed by explaining only those individual neural inputs, independently of each other, which facilitates succinctness of hierarchical formal explanations and helps to increase the overall performance of the approach. Experimental results for a few complex reasoning tasks demonstrate practical efficiency of the proposed approach, in comparison to purely neural systems, from the perspective of explanation size, explanation time, training time, model sizes, and the quality of explanations reported.

DSOct 21, 2024
Timetable Nodes for Public Transport Network

Andrii Rohovyi, Peter J. Stuckey, Toby Walsh

Faster pathfinding in time-dependent transport networks is an important and challenging problem in navigation systems. There are two main types of transport networks: road networks for car driving and public transport route network. The solutions that work well in road networks, such as Time-dependent Contraction Hierarchies and other graph-based approaches, do not usually apply in transport networks. In transport networks, non-graph solutions such as CSA and RAPTOR show the best results compared to graph-based techniques. In our work, we propose a method that advances graph-based approaches by using different optimization techniques from computational geometry to speed up the search process in transport networks. We apply a new pre-computation step, which we call timetable nodes (TTN). Our inspiration comes from an iterative search problem in computational geometry. We implement two versions of the TTN: one uses a Combined Search Tree (TTN-CST), and the second uses Fractional Cascading (TTN-FC). Both of these approaches decrease the asymptotic complexity of reaching new nodes from $O(k\times \log|C|)$ to $O(k + \log(k) + \log(|C|))$, where $k$ is the number of outgoing edges from a node and $|C|$ is the size of the timetable information (total outgoing edges). Our solution suits any other time-dependent networks and can be integrated into other pathfinding algorithms. Our experiments indicate that this pre-computation significantly enhances the performance on high-density graphs. This study showcases how leveraging computational geometry can enhance pathfinding in transport networks, enabling faster pathfinding in scenarios involving large numbers of outgoing edges.

AIMay 15, 2023
Tracking Progress in Multi-Agent Path Finding

Bojie Shen, Zhe Chen, Muhammad Aamir Cheema et al.

Multi-Agent Path Finding (MAPF) is an important core problem for many new and emerging industrial applications. Many works appear on this topic each year, and a large number of substantial advancements and performance improvements have been reported. Yet measuring overall progress in MAPF is difficult: there are many potential competitors, and the computational burden for comprehensive experimentation is prohibitively large. Moreover, detailed data from past experimentation is usually unavailable. In this work, we introduce a set of methodological and visualisation tools which can help the community establish clear indicators for state-of-the-art MAPF performance and which can facilitate large-scale comparisons between MAPF solvers. Our objectives are to lower the barrier of entry for new researchers and to further promote the study of MAPF, since progress in the area and the main challenges are made much clearer.

CYDec 18, 2021
A First Approach to Risk-Limiting Audits for Single Transferable Vote Elections

Michelle Blom, Peter J. Stuckey, Vanessa Teague et al.

Risk-limiting audits (RLAs) are an increasingly important method for checking that the reported outcome of an election is, in fact, correct. Indeed, their use is increasingly being legislated. While effective methods for RLAs have been developed for many forms of election -- for example: first-past-the-post, instant-runoff voting, and D'Hondt elections -- auditing methods for single transferable vote (STV) elections have yet to be developed. STV elections are notoriously hard to reason about since there is a complex interaction of votes that change their value throughout the process. In this paper we present the first approach to risk-limiting audits for STV elections, restricted to the case of 2-seat STV elections.

MAOct 28, 2021
Integrated Task Assignment and Path Planning for Capacitated Multi-Agent Pickup and Delivery

Zhe Chen, Javier Alonso-Mora, Xiaoshan Bai et al.

Multi-agent Pickup and Delivery (MAPD) is a challenging industrial problem where a team of robots is tasked with transporting a set of tasks, each from an initial location and each to a specified target location. Appearing in the context of automated warehouse logistics and automated mail sortation, MAPD requires first deciding which robot is assigned what task (i.e., Task Assignment or TA) followed by a subsequent coordination problem where each robot must be assigned collision-free paths so as to successfully complete its assignment (i.e., Multi-Agent Path Finding or MAPF). Leading methods in this area solve MAPD sequentially: first assigning tasks, then assigning paths. In this work we propose a new coupled method where task assignment choices are informed by actual delivery costs instead of by lower-bound estimates. The main ingredients of our approach are a marginal-cost assignment heuristic and a meta-heuristic improvement strategy based on Large Neighbourhood Search. As a further contribution, we also consider a variant of the MAPD problem where each robot can carry multiple tasks instead of just one. Numerical simulations show that our approach yields efficient and timely solutions and we report significant improvement compared with other recent methods from the literature.

AIAug 2, 2021
Planning with Learned Binarized Neural Networks Benchmarks for MaxSAT Evaluation 2021

Buser Say, Scott Sanner, Jo Devriendt et al.

This document provides a brief introduction to learned automated planning problem where the state transition function is in the form of a binarized neural network (BNN), presents a general MaxSAT encoding for this problem, and describes the four domains, namely: Navigation, Inventory Control, System Administrator and Cellda, that are submitted as benchmarks for MaxSAT Evaluation 2021.

CYJul 25, 2021
Assertion-Based Approaches to Auditing Complex Elections, with Application to Party-List Proportional Elections

Michelle Blom, Jurlind Budurushi, Ronald L. Rivest et al.

Risk-limiting audits (RLAs), an ingredient in evidence-based elections, are increasingly common. They are a rigorous statistical means of ensuring that electoral results are correct, usually without having to perform an expensive full recount -- at the cost of some controlled probability of error. A recently developed approach for conducting RLAs, SHANGRLA, provides a flexible framework that can encompass a wide variety of social choice functions and audit strategies. Its flexibility comes from reducing sufficient conditions for outcomes to be correct to canonical `assertions' that have a simple mathematical form. Assertions have been developed for auditing various social choice functions including plurality, multi-winner plurality, super-majority, Hamiltonian methods, and instant runoff voting. However, there is no systematic approach to building assertions. Here, we show that assertions with linear dependence on transformations of the votes can easily be transformed to canonical form for SHANGRLA. We illustrate the approach by constructing assertions for party-list elections such as Hamiltonian free list elections and elections using the D'Hondt method, expanding the set of social choice functions to which SHANGRLA applies directly.

AIMar 12, 2021
Pairwise Symmetry Reasoning for Multi-Agent Path Finding Search

Jiaoyang Li, Daniel Harabor, Peter J. Stuckey et al.

Multi-Agent Path Finding (MAPF) is a challenging combinatorial problem that asks us to plan collision-free paths for a team of cooperative agents. In this work, we show that one of the reasons why MAPF is so hard to solve is due to a phenomenon called pairwise symmetry, which occurs when two agents have many different paths to their target locations, all of which appear promising, but every combination of them results in a collision. We identify several classes of pairwise symmetries and show that each one arises commonly in practice and can produce an exponential explosion in the space of possible collision resolutions, leading to unacceptable runtimes for current state-of-the-art (bounded-sub)optimal MAPF algorithms. We propose a variety of reasoning techniques that detect the symmetries efficiently as they arise and resolve them by using specialized constraints to eliminate all permutations of pairwise colliding paths in a single branching step. We implement these ideas in the context of the leading optimal MAPF algorithm CBS and show that the addition of the symmetry reasoning techniques can have a dramatic positive effect on its performance - we report a reduction in the number of node expansions by up to four orders of magnitude and an increase in scalability by up to thirty times. These gains allow us to solve to optimality a variety of challenging MAPF instances previously considered out of reach for CBS.

AIFeb 17, 2021
Symmetry Breaking for k-Robust Multi-Agent Path Finding

Zhe Chen, Daniel Harabor, Jiaoyang Li et al.

During Multi-Agent Path Finding (MAPF) problems, agents can be delayed by unexpected events. To address such situations recent work describes k-Robust Conflict-BasedSearch (k-CBS): an algorithm that produces coordinated and collision-free plan that is robust for up to k delays. In this work we introducing a variety of pairwise symmetry breaking constraints, specific to k-robust planning, that can efficiently find compatible and optimal paths for pairs of conflicting agents. We give a thorough description of the new constraints and report large improvements to success rate ina range of domains including: (i) classic MAPF benchmarks;(ii) automated warehouse domains and; (iii) on maps from the 2019 Flatland Challenge, a recently introduced railway domain where k-robust planning can be fruitfully applied to schedule trains.

CYFeb 17, 2021
Auditing Hamiltonian Elections

Michelle Blom, Philip B. Stark, Peter J. Stuckey et al.

Presidential primaries are a critical part of the United States Presidential electoral process, since they are used to select the candidates in the Presidential election. While methods differ by state and party, many primaries involve proportional delegate allocation using the so-called Hamilton method. In this paper we show how to conduct risk-limiting audits for delegate allocation elections using variants of the Hamilton method where the viability of candidates is determined either by a plurality vote or using instant runoff voting. Experiments on real-world elections show that we can audit primary elections to high confidence (small risk limits) usually at low cost.

AIFeb 3, 2021
A Scalable Two Stage Approach to Computing Optimal Decision Sets

Alexey Ignatiev, Edward Lam, Peter J. Stuckey et al.

Machine learning (ML) is ubiquitous in modern life. Since it is being deployed in technologies that affect our privacy and safety, it is often crucial to understand the reasoning behind its decisions, warranting the need for explainable AI. Rule-based models, such as decision trees, decision lists, and decision sets, are conventionally deemed to be the most interpretable. Recent work uses propositional satisfiability (SAT) solving (and its optimization variants) to generate minimum-size decision sets. Motivated by limited practical scalability of these earlier methods, this paper proposes a novel approach to learn minimum-size decision sets by enumerating individual rules of the target decision set independently of each other, and then solving a set cover problem to select a subset of rules. The approach makes use of modern maximum satisfiability and integer linear programming technologies. Experiments on a wide range of publicly available datasets demonstrate the advantage of the new approach over the state of the art in SAT-based decision set learning.

LGDec 4, 2020
Divide and Learn: A Divide and Conquer Approach for Predict+Optimize

Ali Ugur Guler, Emir Demirovic, Jeffrey Chan et al.

The predict+optimize problem combines machine learning ofproblem coefficients with a combinatorial optimization prob-lem that uses the predicted coefficients. While this problemcan be solved in two separate stages, it is better to directlyminimize the optimization loss. However, this requires dif-ferentiating through a discrete, non-differentiable combina-torial function. Most existing approaches use some form ofsurrogate gradient. Demirovicet alshowed how to directlyexpress the loss of the optimization problem in terms of thepredicted coefficients as a piece-wise linear function. How-ever, their approach is restricted to optimization problemswith a dynamic programming formulation. In this work wepropose a novel divide and conquer algorithm to tackle op-timization problems without this restriction and predict itscoefficients using the optimization loss. We also introduce agreedy version of this approach, which achieves similar re-sults with less computation. We compare our approach withother approaches to the predict+optimize problem and showwe can successfully tackle some hard combinatorial problemsbetter than other predict+optimize methods.

AIOct 19, 2020
Optimal Decision Lists using SAT

Jinqiang Yu, Alexey Ignatiev, Pierre Le Bodic et al.

Decision lists are one of the most easily explainable machine learning models. Given the renewed emphasis on explainable machine learning decisions, this machine learning model is increasingly attractive, combining small size and clear explainability. In this paper, we show for the first time how to construct optimal "perfect" decision lists which are perfectly accurate on the training data, and minimal in size, making use of modern SAT solving technology. We also give a new method for determining optimal sparse decision lists, which trade off size and accuracy. We contrast the size and test accuracy of optimal decisions lists versus optimal decision sets, as well as other state-of-the-art methods for determining optimal decision lists. We also examine the size of average explanations generated by decision sets and decision lists.

LGSep 15, 2020
Optimal Decision Trees for Nonlinear Metrics

Emir Demirović, Peter J. Stuckey

Nonlinear metrics, such as the F1-score, Matthews correlation coefficient, and Fowlkes-Mallows index, are often used to evaluate the performance of machine learning models, in particular, when facing imbalanced datasets that contain more samples of one class than the other. Recent optimal decision tree algorithms have shown remarkable progress in producing trees that are optimal with respect to linear criteria, such as accuracy, but unfortunately nonlinear metrics remain a challenge. To address this gap, we propose a novel algorithm based on bi-objective optimisation, which treats misclassifications of each binary class as a separate objective. We show that, for a large class of metrics, the optimal tree lies on the Pareto frontier. Consequently, we obtain the optimal tree by using our method to generate the set of all nondominated trees. To the best of our knowledge, this is the first method to compute provably optimal decision trees for nonlinear metrics. Our approach leads to a trade-off when compared to optimising linear metrics: the resulting trees may be more desirable according to the given nonlinear metric at the expense of higher runtimes. Nevertheless, the experiments illustrate that runtimes are reasonable for majority of the tested datasets.

AIJul 29, 2020
Computing Optimal Decision Sets with SAT

Jinqiang Yu, Alexey Ignatiev, Peter J. Stuckey et al.

As machine learning is increasingly used to help make decisions, there is a demand for these decisions to be explainable. Arguably, the most explainable machine learning models use decision rules. This paper focuses on decision sets, a type of model with unordered rules, which explains each prediction with a single rule. In order to be easy for humans to understand, these rules must be concise. Earlier work on generating optimal decision sets first minimizes the number of rules, and then minimizes the number of literals, but the resulting rules can often be very large. Here we consider a better measure, namely the total size of the decision set in terms of literals. So we are not driven to a small set of rules which require a large number of literals. We provide the first approach to determine minimum-size decision sets that achieve minimum empirical risk and then investigate sparse alternatives where we trade accuracy for size. By finding optimal solutions we show we can build decision set classifiers that are almost as accurate as the best heuristic methods, but far more concise, and hence more explainable.

LGJul 24, 2020
MurTree: Optimal Classification Trees via Dynamic Programming and Search

Emir Demirović, Anna Lukina, Emmanuel Hebrard et al.

Decision tree learning is a widely used approach in machine learning, favoured in applications that require concise and interpretable models. Heuristic methods are traditionally used to quickly produce models with reasonably high accuracy. A commonly criticised point, however, is that the resulting trees may not necessarily be the best representation of the data in terms of accuracy and size. In recent years, this motivated the development of optimal classification tree algorithms that globally optimise the decision tree in contrast to heuristic methods that perform a sequence of locally optimal decisions. We follow this line of work and provide a novel algorithm for learning optimal classification trees based on dynamic programming and search. Our algorithm supports constraints on the depth of the tree and number of nodes. The success of our approach is attributed to a series of specialised techniques that exploit properties unique to classification trees. Whereas algorithms for optimal classification trees have traditionally been plagued by high runtimes and limited scalability, we show in a detailed experimental study that our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances, providing several orders of magnitude improvements and notably contributing towards the practical realisation of optimal decision trees.

AIDec 21, 2018
Solution Dominance over Constraint Satisfaction Problems

Tias Guns, Peter J. Stuckey, Guido Tack

Constraint Satisfaction Problems (CSPs) typically have many solutions that satisfy all constraints. Often though, some solutions are preferred over others, that is, some solutions dominate other solutions. We present solution dominance as a formal framework to reason about such settings. We define Constraint Dominance Problems (CDPs) as CSPs with a dominance relation, that is, a preorder over the solutions of the CSP. This framework captures many well-known variants of constraint satisfaction, including optimization, multi-objective optimization, Max-CSP, minimal models, minimum correction subsets as well as optimization over CP-nets and arbitrary dominance relations. We extend MiniZinc, a declarative language for modeling CSPs, to CDPs by introducing dominance nogoods; these can be derived from dominance relations in a principled way. A generic method for solving arbitrary CDPs incrementally calls a CSP solver and is compatible with any existing solver that supports MiniZinc. This encourages experimenting with different solution dominance relations for a problem, as well as comparing different solvers without having to modify their implementations.

AIDec 15, 2018
Searching with Consistent Prioritization for Multi-Agent Path Finding

Hang Ma, Daniel Harabor, Peter J. Stuckey et al.

We study prioritized planning for Multi-Agent Path Finding (MAPF). Existing prioritized MAPF algorithms depend on rule-of-thumb heuristics and random assignment to determine a fixed total priority ordering of all agents a priori. We instead explore the space of all possible partial priority orderings as part of a novel systematic and conflict-driven combinatorial search framework. In a variety of empirical comparisons, we demonstrate state-of-the-art solution qualities and success rates, often with similar runtimes to existing algorithms. We also develop new theoretical results that explore the limitations of prioritized planning, in terms of completeness and optimality, for the first time.

LOAug 25, 2015
Unsatisfiable Cores and Lower Bounding for Constraint Programming

Nicholas Downing, Thibaut Feydy, Peter J. Stuckey

Constraint Programming (CP) solvers typically tackle optimization problems by repeatedly finding solutions to a problem while placing tighter and tighter bounds on the solution cost. This approach is somewhat naive, especially for soft-constraint optimization problems in which the soft constraints are mostly satisfied. Unsatisfiable-core approaches to solving soft constraint problems in SAT (e.g. MAXSAT) force all soft constraints to be hard initially. When solving fails they return an unsatisfiable core, as a set of soft constraints that cannot hold simultaneously. These are reverted to soft and solving continues. Since lazy clause generation solvers can also return unsatisfiable cores we can adapt this approach to constraint programming. We adapt the original MAXSAT unsatisfiable core solving approach to be usable for constraint programming and define a number of extensions. Experimental results show that our methods are beneficial on a broad class of CP-optimization benchmarks involving soft constraints, cardinality or preferences.

AIAug 20, 2015
Efficient Computation of Exact IRV Margins

Michelle Blom, Peter J. Stuckey, Vanessa J. Teague et al.

The margin of victory is easy to compute for many election schemes but difficult for Instant Runoff Voting (IRV). This is important because arguments about the correctness of an election outcome usually rely on the size of the electoral margin. For example, risk-limiting audits require a knowledge of the margin of victory in order to determine how much auditing is necessary. This paper presents a practical branch-and-bound algorithm for exact IRV margin computation that substantially improves on the current best-known approach. Although exponential in the worst case, our algorithm runs efficiently in practice on all the real examples we could find. We can efficiently discover exact margins on election instances that cannot be solved by the current state-of-the-art.

AIJun 19, 2013
Structure Based Extended Resolution for Constraint Programming

Geoffrey Chu, Peter J. Stuckey

Nogood learning is a powerful approach to reducing search in Constraint Programming (CP) solvers. The current state of the art, called Lazy Clause Generation (LCG), uses resolution to derive nogoods expressing the reasons for each search failure. Such nogoods can prune other parts of the search tree, producing exponential speedups on a wide variety of problems. Nogood learning solvers can be seen as resolution proof systems. The stronger the proof system, the faster it can solve a CP problem. It has recently been shown that the proof system used in LCG is at least as strong as general resolution. However, stronger proof systems such as \emph{extended resolution} exist. Extended resolution allows for literals expressing arbitrary logical concepts over existing variables to be introduced and can allow exponentially smaller proofs than general resolution. The primary problem in using extended resolution is to figure out exactly which literals are useful to introduce. In this paper, we show that we can use the structural information contained in a CP model in order to introduce useful literals, and that this can translate into significant speedups on a range of problems.

LOMay 8, 2013
Unsatisfiable Cores for Constraint Programming

Nicholas Downing, Thibaut Feydy, Peter J. Stuckey

Constraint Programming (CP) solvers typically tackle optimization problems by repeatedly finding solutions to a problem while placing tighter and tighter bounds on the solution cost. This approach is somewhat naive, especially for soft-constraint optimization problems in which the soft constraints are mostly satisfied. Unsatisfiable-core approaches to solving soft constraint problems in Boolean Satisfiability (e.g. MAXSAT) force all soft constraints to hold initially. When solving fails they return an unsatisfiable core, as a set of soft constraints that cannot hold simultaneously. Using this information the problem is relaxed to allow certain soft constraint(s) to be violated and solving continues. Since Lazy Clause Generation (LCG) solvers can also return unsatisfiable cores we can adapt the MAXSAT unsatisfiable core approach to CP. We implement the original MAXSAT unsatisfiable core solving algorithms WPM1, MSU3 in a state-of-the-art LCG solver and show that there exist problems which benefit from this hybrid approach.

AIAug 15, 2012
Explaining Time-Table-Edge-Finding Propagation for the Cumulative Resource Constraint

Andreas Schutt, Thibaut Feydy, Peter J. Stuckey

Cumulative resource constraints can model scarce resources in scheduling problems or a dimension in packing and cutting problems. In order to efficiently solve such problems with a constraint programming solver, it is important to have strong and fast propagators for cumulative resource constraints. One such propagator is the recently developed time-table-edge-finding propagator, which considers the current resource profile during the edge-finding propagation. Recently, lazy clause generation solvers, i.e. constraint programming solvers incorporating nogood learning, have proved to be excellent at solving scheduling and cutting problems. For such solvers, concise and accurate explanations of the reasons for propagation are essential for strong nogood learning. In this paper, we develop the first explaining version of time-table-edge-finding propagation and show preliminary results on resource-constrained project scheduling problems from various standard benchmark suites. On the standard benchmark suite PSPLib, we were able to close one open instance and to improve the lower bound of about 60% of the remaining open instances. Moreover, 6 of those instances were closed.

AIMar 6, 2012
Search Combinators

Tom Schrijvers, Guido Tack, Pieter Wuille et al.

The ability to model search in a constraint solver can be an essential asset for solving combinatorial problems. However, existing infrastructure for defining search heuristics is often inadequate. Either modeling capabilities are extremely limited or users are faced with a general-purpose programming language whose features are not tailored towards writing search heuristics. As a result, major improvements in performance may remain unexplored. This article introduces search combinators, a lightweight and solver-independent method that bridges the gap between a conceptually simple modeling language for search (high-level, functional and naturally compositional) and an efficient implementation (low-level, imperative and highly non-modular). By allowing the user to define application-tailored search strategies from a small set of primitives, search combinators effectively provide a rich domain-specific language (DSL) for modeling search to the user. Remarkably, this DSL comes at a low implementation cost to the developer of a constraint solver. The article discusses two modular implementation approaches and shows, by empirical evaluation, that search combinators can be implemented without overhead compared to a native, direct implementation in a constraint solver.