AIMay 20, 2022
On Tackling Explanation Redundancy in Decision TreesYacine Izza, Alexey Ignatiev, Joao Marques-Silva
Decision trees (DTs) epitomize the ideal of interpretability of machine learning (ML) models. The interpretability of decision trees motivates explainability approaches by so-called intrinsic interpretability, and it is at the core of recent proposals for applying interpretable ML models in high-risk applications. The belief in DT interpretability is justified by the fact that explanations for DT predictions are generally expected to be succinct. Indeed, in the case of DTs, explanations correspond to DT paths. Since decision trees are ideally shallow, and so paths contain far fewer features than the total number of features, explanations in DTs are expected to be succinct, and hence interpretable. This paper offers both theoretical and experimental arguments demonstrating that, as long as interpretability of decision trees equates with succinctness of explanations, then decision trees ought not be deemed interpretable. The paper introduces logically rigorous path explanations and path explanation redundancy, and proves that there exist functions for which decision trees must exhibit paths with arbitrarily large explanation redundancy. The paper also proves that only a very restricted class of functions can be represented with DTs that exhibit no explanation redundancy. In addition, the paper includes experimental results substantiating that path explanation redundancy is observed ubiquitously in decision trees, including those obtained using different tree learning algorithms, but also in a wide range of publicly available decision trees. The paper also proposes polynomial-time algorithms for eliminating path explanation redundancy, which in practice require negligible time to compute. Thus, these algorithms serve to indirectly attain irreducible, and so succinct, explanations for decision trees.
AIOct 24, 2022
Logic-Based Explainability in Machine LearningJoao Marques-Silva
The last decade witnessed an ever-increasing stream of successes in Machine Learning (ML). These successes offer clear evidence that ML is bound to become pervasive in a wide range of practical uses, including many that directly affect humans. Unfortunately, the operation of the most successful ML models is incomprehensible for human decision makers. As a result, the use of ML models, especially in high-risk and safety-critical settings is not without concern. In recent years, there have been efforts on devising approaches for explaining ML models. Most of these efforts have focused on so-called model-agnostic approaches. However, all model-agnostic and related approaches offer no guarantees of rigor, hence being referred to as non-formal. For example, such non-formal explanations can be consistent with different predictions, which renders them useless in practice. This paper overviews the ongoing research efforts on computing rigorous model-based explanations of ML models; these being referred to as formal explanations. These efforts encompass a variety of topics, that include the actual definitions of explanations, the characterization of the complexity of computing explanations, the currently best logical encodings for reasoning about different ML models, and also how to make explanations interpretable for human decision makers, among others.
LGFeb 16, 2023
The Inadequacy of Shapley Values for ExplainabilityXuanxiang Huang, Joao Marques-Silva
This paper develops a rigorous argument for why the use of Shapley values in explainable AI (XAI) will necessarily yield provably misleading information about the relative importance of features for predictions. Concretely, this paper demonstrates that there exist classifiers, and associated predictions, for which the relative importance of features determined by the Shapley values will incorrectly assign more importance to features that are provably irrelevant for the prediction, and less importance to features that are provably relevant for the prediction. The paper also argues that, given recent complexity results, the existence of efficient algorithms for the computation of rigorous feature attribution values in the case of some restricted classes of classifiers should be deemed unlikely at best.
AIJun 27, 2023
Explainability is NOT a GameJoao Marques-Silva, Xuanxiang Huang
Explainable artificial intelligence (XAI) aims to help human decision-makers in understanding complex machine learning (ML) models. One of the hallmarks of XAI are measures of relative feature importance, which are theoretically justified through the use of Shapley values. This paper builds on recent work and offers a simple argument for why Shapley values can provide misleading measures of relative feature importance, by assigning more importance to features that are irrelevant for a prediction, and assigning less importance to features that are relevant for a prediction. The significance of these results is that they effectively challenge the many proposed uses of measures of relative feature importance in a fast-growing range of high-stakes application domains.
AIJun 20, 2022
Eliminating The Impossible, Whatever Remains Must Be TrueJinqiang Yu, Alexey Ignatiev, Peter J. Stuckey et al.
The rise of AI methods to make predictions and decisions has led to a pressing need for more explainable artificial intelligence (XAI) methods. One common approach for XAI is to produce a post-hoc explanation, explaining why a black box ML model made a certain prediction. Formal approaches to post-hoc explanations provide succinct reasons for why a prediction was made, as well as why not another prediction was made. But these approaches assume that features are independent and uniformly distributed. While this means that "why" explanations are correct, they may be longer than required. It also means the "why not" explanations may be suspect as the counterexamples they rely on may not be meaningful. In this paper, we show how one can apply background knowledge to give more succinct "why" formal explanations, that are presumably easier to interpret by humans, and give more accurate "why not" explanations. In addition, we show how to use existing rule induction techniques to efficiently extract background information from a dataset, and also how to report which background information was used to make an explanation, allowing a human to examine it if they doubt the correctness of the explanation.
AIDec 12, 2022
On Computing Probabilistic Abductive ExplanationsYacine Izza, Xuanxiang Huang, Alexey Ignatiev et al.
The most widely studied explainable AI (XAI) approaches are unsound. This is the case with well-known model-agnostic explanation approaches, and it is also the case with approaches based on saliency maps. One solution is to consider intrinsic interpretability, which does not exhibit the drawback of unsoundness. Unfortunately, intrinsic interpretability can display unwieldy explanation redundancy. Formal explainability represents the alternative to these non-rigorous approaches, with one example being PI-explanations. Unfortunately, PI-explanations also exhibit important drawbacks, the most visible of which is arguably their size. Recently, it has been observed that the (absolute) rigor of PI-explanations can be traded off for a smaller explanation size, by computing the so-called relevant sets. Given some positive δ, a set S of features is δ-relevant if, when the features in S are fixed, the probability of getting the target class exceeds δ. However, even for very simple classifiers, the complexity of computing relevant sets of features is prohibitive, with the decision problem being NPPP-complete for circuit-based classifiers. In contrast with earlier negative results, this paper investigates practical approaches for computing relevant sets for a number of widely used classifiers that include Decision Trees (DTs), Naive Bayes Classifiers (NBCs), and several families of classifiers obtained from propositional languages. Moreover, the paper shows that, in practice, and for these families of classifiers, relevant sets are easy to compute. Furthermore, the experiments confirm that succinct sets of relevant features can be obtained for the families of classifiers considered.
LGOct 27, 2022
Feature Necessity & Relevancy in ML Classifier ExplanationsXuanxiang Huang, Martin C. Cooper, Antonio Morgado et al.
Given a machine learning (ML) model and a prediction, explanations can be defined as sets of features which are sufficient for the prediction. In some applications, and besides asking for an explanation, it is also critical to understand whether sensitive features can occur in some explanation, or whether a non-interesting feature must occur in all explanations. This paper starts by relating such queries respectively with the problems of relevancy and necessity in logic-based abduction. The paper then proves membership and hardness results for several families of ML classifiers. Afterwards the paper proposes concrete algorithms for two classes of classifiers. The experimental results confirm the scalability of the proposed algorithms.
AIJun 27, 2023
Delivering Inflated ExplanationsYacine Izza, Alexey Ignatiev, Peter Stuckey et al.
In the quest for Explainable Artificial Intelligence (XAI) one of the questions that frequently arises given a decision made by an AI system is, ``why was the decision made in this way?'' Formal approaches to explainability build a formal model of the AI system and use this to reason about the properties of the system. Given a set of feature values for an instance to be explained, and a resulting decision, a formal abductive explanation is a set of features, such that if they take the given value will always lead to the same decision. This explanation is useful, it shows that only some features were used in making the final decision. But it is narrow, it only shows that if the selected features take their given values the decision is unchanged. It's possible that some features may change values and still lead to the same decision. In this paper we formally define inflated explanations which is a set of features, and for each feature of set of values (always including the value of the instance being explained), such that the decision will remain unchanged. Inflated explanations are more informative than abductive explanations since e.g they allow us to see if the exact value of a feature is important, or it could be any nearby value. Overall they allow us to better understand the role of each feature in the decision. We show that we can compute inflated explanations for not that much greater cost than abductive explanations, and that we can extend duality results for abductive explanations also to inflated explanations.
AIJun 5, 2023
From Robustness to Explainability and Back AgainXuanxiang Huang, Joao Marques-Silva
Formal explainability guarantees the rigor of computed explanations, and so it is paramount in domains where rigor is critical, including those deemed high-risk. Unfortunately, since its inception formal explainability has been hampered by poor scalability. At present, this limitation still holds true for some families of classifiers, the most significant being deep neural networks. This paper addresses the poor scalability of formal explainability and proposes novel efficient algorithms for computing formal explanations. The novel algorithm computes explanations by answering instead a number of robustness queries, and such that the number of such queries is at most linear on the number of features. Consequently, the proposed algorithm establishes a direct relationship between the practical complexity of formal explainability and that of robustness. To achieve the proposed goals, the paper generalizes the definition of formal explanations, thereby allowing the use of robustness tools that are based on different distance norms, and also by reasoning in terms of some target degree of robustness. Preliminary experiments validate the practical efficiency of the proposed approach.
AIMay 19, 2022
Provably Precise, Succinct and Efficient Explanations for Decision TreesYacine Izza, Alexey Ignatiev, Nina Narodytska et al.
Decision trees (DTs) embody interpretable classifiers. DTs have been advocated for deployment in high-risk applications, but also for explaining other complex classifiers. Nevertheless, recent work has demonstrated that predictions in DTs ought to be explained with rigorous approaches. Although rigorous explanations can be computed in polynomial time for DTs, their size may be beyond the cognitive limits of human decision makers. This paper investigates the computation of δ-relevant sets for DTs. δ-relevant sets denote explanations that are succinct and provably precise. These sets represent generalizations of rigorous explanations, which are precise with probability one, and so they enable trading off explanation size for precision. The paper proposes two logic encodings for computing smallest δ-relevant sets for DTs. The paper further devises a polynomial-time algorithm for computing δ-relevant sets which are not guaranteed to be subset-minimal, but for which the experiments show to be most often subset-minimal in practice. The experimental results also demonstrate the practical efficiency of computing smallest δ-relevant sets.
LGJul 11, 2022
On Computing Relevant Features for Explaining NBCsYacine Izza, Joao Marques-Silva
Despite the progress observed with model-agnostic explainable AI (XAI), it is the case that model-agnostic XAI can produce incorrect explanations. One alternative are the so-called formal approaches to XAI, that include PI-explanations. Unfortunately, PI-explanations also exhibit important drawbacks, the most visible of which is arguably their size. The computation of relevant features serves to trade off probabilistic precision for the number of features in an explanation. However, even for very simple classifiers, the complexity of computing sets of relevant features is prohibitive. This paper investigates the computation of relevant sets for Naive Bayes Classifiers (NBCs), and shows that, in practice, these are easy to compute. Furthermore, the experiments confirm that succinct sets of relevant features can be obtained with NBCs.
AISep 6, 2023
A Refutation of Shapley Values for ExplainabilityXuanxiang Huang, Joao Marques-Silva
Recent work demonstrated the existence of Boolean functions for which Shapley values provide misleading information about the relative importance of features in rule-based explanations. Such misleading information was broadly categorized into a number of possible issues. Each of those issues relates with features being relevant or irrelevant for a prediction, and all are significant regarding the inadequacy of Shapley values for rule-based explainability. This earlier work devised a brute-force approach to identify Boolean functions, defined on small numbers of features, and also associated instances, which displayed such inadequacy-revealing issues, and so served as evidence to the inadequacy of Shapley values for rule-based explainability. However, an outstanding question is how frequently such inadequacy-revealing issues can occur for Boolean functions with arbitrary large numbers of features. It is plain that a brute-force approach would be unlikely to provide insights on how to tackle this question. This paper answers the above question by proving that, for any number of features, there exist Boolean functions that exhibit one or more inadequacy-revealing issues, thereby contributing decisive arguments against the use of Shapley values as the theoretical underpinning of feature-attribution methods in explainability.
AIJun 27, 2023
On Logic-Based Explainability with Partially Specified InputsRamón Béjar, António Morgado, Jordi Planes et al.
In the practical deployment of machine learning (ML) models, missing data represents a recurring challenge. Missing data is often addressed when training ML models. But missing data also needs to be addressed when deciding predictions and when explaining those predictions. Missing data represents an opportunity to partially specify the inputs of the prediction to be explained. This paper studies the computation of logic-based explanations in the presence of partially specified inputs. The paper shows that most of the algorithms proposed in recent years for computing logic-based explanations can be generalized for computing explanations given the partially specified inputs. One related result is that the complexity of computing logic-based explanations remains unchanged. A similar result is proved in the case of logic-based explainability subject to input constraints. Furthermore, the proposed solution for computing explanations given partially specified inputs is applied to classifiers obtained from well-known public datasets, thereby illustrating a number of novel explainability use cases.
AIMar 31
Rigorous Explanations for Tree EnsemblesYacine Izza, Alexey Ignatiev, Xuanxiang Huang et al.
Tree ensembles (TEs) find a multitude of practical applications. They represent one of the most general and accurate classes of machine learning methods. While they are typically quite concise in representation, their operation remains inscrutable to human decision makers. One solution to build trust in the operation of TEs is to automatically identify explanations for the predictions made. Evidently, we can only achieve trust using explanations, if those explanations are rigorous, that is truly reflect properties of the underlying predictor they explain This paper investigates the computation of rigorously-defined, logically-sound explanations for the concrete case of two well-known examples of tree ensembles, namely random forests and boosted trees.
AINov 5, 2025
Uncovering Bugs in Formal Explainers: A Case Study with PyXAIXuanxiang Huang, Yacine Izza, Alexey Ignatiev et al.
Formal explainable artificial intelligence (XAI) offers unique theoretical guarantees of rigor when compared to other non-formal methods of explainability. However, little attention has been given to the validation of practical implementations of formal explainers. This paper develops a novel methodology for validating formal explainers and reports on the assessment of the publicly available formal explainer PyXAI. The paper documents the existence of incorrect explanations computed by PyXAI on most of the datasets analyzed in the experiments, thereby confirming the importance of the proposed novel methodology for the validation of formal explainers.
AISep 30, 2023
Refutation of Shapley Values for XAI -- Additional EvidenceXuanxiang Huang, Joao Marques-Silva
Recent work demonstrated the inadequacy of Shapley values for explainable artificial intelligence (XAI). Although to disprove a theory a single counterexample suffices, a possible criticism of earlier work is that the focus was solely on Boolean classifiers. To address such possible criticism, this paper demonstrates the inadequacy of Shapley values for families of classifiers where features are not boolean, but also for families of classifiers for which multiple classes can be picked. Furthermore, the paper shows that the features changed in any minimal $l_0$ distance adversarial examples do not include irrelevant features, thus offering further arguments regarding the inadequacy of Shapley values for XAI.
AIApr 17
Towards Rigorous Explainability by Feature AttributionOlivier Létoffé, Xuanxiang Huang, Joao Marques-Silva
For around a decade, non-symbolic methods have been the option of choice when explaining complex machine learning (ML) models. Unfortunately, such methods lack rigor and can mislead human decision-makers. In high-stakes uses of ML, the lack of rigor is especially problematic. One prime example of provable lack of rigor is the adoption of Shapley values in explainable artificial intelligence (XAI), with the tool SHAP being a ubiquitous example. This paper overviews the ongoing efforts towards using rigorous symbolic methods of XAI as an alternative to non-rigorous non-symbolic approaches, concretely for assigning relative feature importance.
AIMay 20, 2024
From SHAP Scores to Feature Importance ScoresOlivier Letoffe, Xuanxiang Huang, Nicholas Asher et al.
A central goal of eXplainable Artificial Intelligence (XAI) is to assign relative importance to the features of a Machine Learning (ML) model given some prediction. The importance of this task of explainability by feature attribution is illustrated by the ubiquitous recent use of tools such as SHAP and LIME. Unfortunately, the exact computation of feature attributions, using the game-theoretical foundation underlying SHAP and LIME, can yield manifestly unsatisfactory results, that tantamount to reporting misleading relative feature importance. Recent work targeted rigorous feature attribution, by studying axiomatic aggregations of features based on logic-based definitions of explanations by feature selection. This paper shows that there is an essential relationship between feature attribution and a priori voting power, and that those recently proposed axiomatic aggregations represent a few instantiations of the range of power indices studied in the past. Furthermore, it remains unclear how some of the most widely used power indices might be exploited as feature importance scores (FISs), i.e. the use of power indices in XAI, and which of these indices would be the best suited for the purposes of XAI by feature attribution, namely in terms of not producing results that could be deemed as unsatisfactory. This paper proposes novel desirable properties that FISs should exhibit. In addition, the paper also proposes novel FISs exhibiting the proposed properties. Finally, the paper conducts a rigorous analysis of the best-known power indices in terms of the proposed properties.
LGMay 14, 2024
Distance-Restricted Explanations: Theoretical Underpinnings & Efficient ImplementationYacine Izza, Xuanxiang Huang, Antonio Morgado et al.
The uses of machine learning (ML) have snowballed in recent years. In many cases, ML models are highly complex, and their operation is beyond the understanding of human decision-makers. Nevertheless, some uses of ML models involve high-stakes and safety-critical applications. Explainable artificial intelligence (XAI) aims to help human decision-makers in understanding the operation of such complex ML models, thus eliciting trust in their operation. Unfortunately, the majority of past XAI work is based on informal approaches, that offer no guarantees of rigor. Unsurprisingly, there exists comprehensive experimental and theoretical evidence confirming that informal methods of XAI can provide human-decision makers with erroneous information. Logic-based XAI represents a rigorous approach to explainability; it is model-based and offers the strongest guarantees of rigor of computed explanations. However, a well-known drawback of logic-based XAI is the complexity of logic reasoning, especially for highly complex ML models. Recent work proposed distance-restricted explanations, i.e. explanations that are rigorous provided the distance to a given input is small enough. Distance-restricted explainability is tightly related with adversarial robustness, and it has been shown to scale for moderately complex ML models, but the number of inputs still represents a key limiting factor. This paper investigates novel algorithms for scaling up the performance of logic-based explainers when computing and enumerating ML model explanations with a large number of inputs.
LGDec 19, 2023
Locally-Minimal Probabilistic ExplanationsYacine Izza, Kuldeep S. Meel, Joao Marques-Silva
Explainable Artificial Intelligence (XAI) is widely regarding as a cornerstone of trustworthy AI. Unfortunately, most work on XAI offers no guarantees of rigor. In high-stakes domains, e.g. uses of AI that impact humans, the lack of rigor of explanations can have disastrous consequences. Formal abductive explanations offer crucial guarantees of rigor and so are of interest in high-stakes uses of machine learning (ML). One drawback of abductive explanations is explanation size, justified by the cognitive limits of human decision-makers. Probabilistic abductive explanations (PAXps) address this limitation, but their theoretical and practical complexity makes their exact computation most often unrealistic. This paper proposes novel efficient algorithms for the computation of locally-minimal PXAps, which offer high-quality approximations of PXAps in practice. The experimental results demonstrate the practical efficiency of the proposed algorithms.
LGDec 24, 2024
Efficient Contrastive Explanations on DemandYacine Izza, Joao Marques-Silva
Recent work revealed a tight connection between adversarial robustness and restricted forms of symbolic explanations, namely distance-based (formal) explanations. This connection is significant because it represents a first step towards making the computation of symbolic explanations as efficient as deciding the existence of adversarial examples, especially for highly complex machine learning (ML) models. However, a major performance bottleneck remains, because of the very large number of features that ML models may possess, in particular for deep neural networks. This paper proposes novel algorithms to compute the so-called contrastive explanations for ML models with a large number of features, by leveraging on adversarial robustness. Furthermore, the paper also proposes novel algorithms for listing explanations and finding smallest contrastive explanations. The experimental results demonstrate the performance gains achieved by the novel algorithms proposed in this paper.
LGDec 18, 2024
SHAP scores fail pervasively even when Lipschitz succeedsOlivier Letoffe, Xuanxiang Huang, Joao Marques-Silva
The ubiquitous use of Shapley values in eXplainable AI (XAI) has been triggered by the tool SHAP, and as a result are commonly referred to as SHAP scores. Recent work devised examples of machine learning (ML) classifiers for which the computed SHAP scores are thoroughly unsatisfactory, by allowing human decision-makers to be misled. Nevertheless, such examples could be perceived as somewhat artificial, since the selected classes must be interpreted as numeric. Furthermore, it was unclear how general were the issues identified with SHAP scores. This paper answers these criticisms. First, the paper shows that for Boolean classifiers there are arbitrarily many examples for which the SHAP scores must be deemed unsatisfactory. Second, the paper shows that the issues with SHAP scores are also observed in the case of regression models. In addition, the paper studies the class of regression models that respect Lipschitz continuity, a measure of a function's rate of change that finds important recent uses in ML, including model robustness. Concretely, the paper shows that the issues with SHAP scores occur even for regression models that respect Lipschitz continuity. Finally, the paper shows that the same issues are guaranteed to exist for arbitrarily differentiable regression models.
LGDec 18, 2023
The Pros and Cons of Adversarial RobustnessYacine Izza, Joao Marques-Silva
Robustness is widely regarded as a fundamental problem in the analysis of machine learning (ML) models. Most often robustness equates with deciding the non-existence of adversarial examples, where adversarial examples denote situations where small changes on some inputs cause a change in the prediction. The perceived importance of ML model robustness explains the continued progress observed for most of the last decade. Whereas robustness is often assessed locally, i.e. given some target point in feature space, robustness can also be defined globally, i.e. where any point in feature space can be considered. The importance of ML model robustness is illustrated for example by the existence of competitions evaluating the progress of robustness tools, namely in the case of neural networks (NNs) but also by efforts towards robustness certification. More recently, robustness tools have also been used for computing rigorous explanations of ML models. In contrast with the observed successes of robustness, this paper uncovers some limitations with existing definitions of robustness, both global and local, but also with efforts towards robustness certification. The paper also investigates uses of adversarial examples besides those related with robustness.
AISep 22, 2025
Efficient & Correct Predictive Equivalence for Decision TreesJoao Marques-Silva, Alexey Ignatiev
The Rashomon set of decision trees (DTs) finds importance uses. Recent work showed that DTs computing the same classification function, i.e. predictive equivalent DTs, can represent a significant fraction of the Rashomon set. Such redundancy is undesirable. For example, feature importance based on the Rashomon set becomes inaccurate due the existence of predictive equivalent DTs, i.e. DTs with the same prediction for every possible input. In recent work, McTavish et al. proposed solutions for several computational problems related with DTs, including that of deciding predictive equivalent DTs. The approach of McTavish et al. consists of applying the well-known method of Quine-McCluskey (QM) for obtaining minimum-size DNF (disjunctive normal form) representations of DTs, which are then used for comparing DTs for predictive equivalence. Furthermore, the minimum-size DNF representation was also applied to computing explanations for the predictions made by DTs, and to finding predictions in the presence of missing data. However, the problem of formula minimization is hard for the second level of the polynomial hierarchy, and the QM method may exhibit worst-case exponential running time and space. This paper first demonstrates that there exist decision trees that trigger the worst-case exponential running time and space of the QM method. Second, the paper shows that the QM method may incorrectly decide predictive equivalence, if two key constraints are not respected, and one may be difficult to formally guarantee. Third, the paper shows that any of the problems to which the smallest DNF representation has been applied to can be solved in polynomial time, in the size of the DT. The experiments confirm that, for DTs for which the worst-case of the QM method is triggered, the algorithms proposed in this paper are orders of magnitude faster than the ones proposed by McTavish et al.
AIAug 16, 2025
Rigorous Feature Importance Scores based on Shapley Value and Banzhaf IndexXuanxiang Huang, Olivier Létoffé, Joao Marques-Silva
Feature attribution methods based on game theory are ubiquitous in the field of eXplainable Artificial Intelligence (XAI). Recent works proposed rigorous feature attribution using logic-based explanations, specifically targeting high-stakes uses of machine learning (ML) models. Typically, such works exploit weak abductive explanation (WAXp) as the characteristic function to assign importance to features. However, one possible downside is that the contribution of non-WAXp sets is neglected. In fact, non-WAXp sets can also convey important information, because of the relationship between formal explanations (XPs) and adversarial examples (AExs). Accordingly, this paper leverages Shapley value and Banzhaf index to devise two novel feature importance scores. We take into account non-WAXp sets when computing feature contribution, and the novel scores quantify how effective each feature is at excluding AExs. Furthermore, the paper identifies properties and studies the computational complexity of the proposed scores.
AIJul 10, 2025
On Trustworthy Rule-Based Models and ExplanationsMohamed Siala, Jordi Planes, Joao Marques-Silva
A task of interest in machine learning (ML) is that of ascribing explanations to the predictions made by ML models. Furthermore, in domains deemed high risk, the rigor of explanations is paramount. Indeed, incorrect explanations can and will mislead human decision makers. As a result, and even if interpretability is acknowledged as an elusive concept, so-called interpretable models are employed ubiquitously in high-risk uses of ML and data mining (DM). This is the case for rule-based ML models, which encompass decision trees, diagrams, sets and lists. This paper relates explanations with well-known undesired facets of rule-based ML models, which include negative overlap and several forms of redundancy. The paper develops algorithms for the analysis of these undesired facets of rule-based systems, and concludes that well-known and widely used tools for learning rule-based ML models will induce rule sets that exhibit one or more negative facets.
AIJan 20, 2025
The Explanation Game -- Rekindled (Extended Version)Joao Marques-Silva, Xuanxiang Huang, Olivier Letoffe
Recent work demonstrated the existence of critical flaws in the current use of Shapley values in explainable AI (XAI), i.e. the so-called SHAP scores. These flaws are significant in that the scores provided to a human decision-maker can be misleading. Although these negative results might appear to indicate that Shapley values ought not be used in XAI, this paper argues otherwise. Concretely, this paper proposes a novel definition of SHAP scores that overcomes existing flaws. Furthermore, the paper outlines a practically efficient solution for the rigorous estimation of the novel SHAP scores. Preliminary experimental results confirm our claims, and further underscore the flaws of the current SHAP scores.
AIJun 4, 2024
Logic-Based Explainability: Past, Present & FutureJoao Marques-Silva
In recent years, the impact of machine learning (ML) and artificial intelligence (AI) in society has been absolutely remarkable. This impact is expected to continue in the foreseeable future. However,the adoption of AI/ML is also a cause of grave concern. The operation of the most advances AI/ML models is often beyond the grasp of human decision makers. As a result, decisions that impact humans may not be understood and may lack rigorous validation. Explainable AI (XAI) is concerned with providing human decision-makers with understandable explanations for the predictions made by ML models. As a result, XAI is a cornerstone of trustworthy AI. Despite its strategic importance, most work on XAI lacks rigor, and so its use in high-risk or safety-critical domains serves to foster distrust instead of contributing to build much-needed trust. Logic-based XAI has recently emerged as a rigorous alternative to those other non-rigorous methods of XAI. This paper provides a technical survey of logic-based XAI, its origins, the current topics of research, and emerging future topics of research. The paper also highlights the many myths that pervade non-rigorous approaches for XAI.
LGApr 30, 2024
Towards trustable SHAP scoresOlivier Letoffe, Xuanxiang Huang, Joao Marques-Silva
SHAP scores represent the proposed use of the well-known Shapley values in eXplainable Artificial Intelligence (XAI). Recent work has shown that the exact computation of SHAP scores can produce unsatisfactory results. Concretely, for some ML models, SHAP scores will mislead with respect to relative feature influence. To address these limitations, recently proposed alternatives exploit different axiomatic aggregations, all of which are defined in terms of abductive explanations. However, the proposed axiomatic aggregations are not Shapley values. This paper investigates how SHAP scores can be modified so as to extend axiomatic aggregations to the case of Shapley values in XAI. More importantly, the proposed new definition of SHAP scores avoids all the known cases where unsatisfactory results have been identified. The paper also characterizes the complexity of computing the novel definition of SHAP scores, highlighting families of classifiers for which computing these scores is tractable. Furthermore, the paper proposes modifications to the existing implementations of SHAP scores. These modifications eliminate some of the known limitations of SHAP scores, and have negligible impact in terms of performance.
AIMay 13, 2023
Disproving XAI Myths with Formal Methods -- Initial ResultsJoao Marques-Silva
The advances in Machine Learning (ML) in recent years have been both impressive and far-reaching. However, the deployment of ML models is still impaired by a lack of trust in how the best-performing ML models make predictions. The issue of lack of trust is even more acute in the uses of ML models in high-risk or safety-critical domains. eXplainable artificial intelligence (XAI) is at the core of ongoing efforts for delivering trustworthy AI. Unfortunately, XAI is riddled with critical misconceptions, that foster distrust instead of building trust. This paper details some of the most visible misconceptions in XAI, and shows how formal methods have been used, both to disprove those misconceptions, but also to devise practically effective alternatives.
AIFeb 15, 2022
On Deciding Feature Membership in Explanations of SDD & Related ClassifiersXuanxiang Huang, Joao Marques-Silva
When reasoning about explanations of Machine Learning (ML) classifiers, a pertinent query is to decide whether some sensitive features can serve for explaining a given prediction. Recent work showed that the feature membership problem (FMP) is hard for $Σ_2^P$ for a broad class of classifiers. In contrast, this paper shows that for a number of families of classifiers, FMP is in NP. Concretely, the paper proves that any classifier for which an explanation can be computed in polynomial time, then deciding feature membership in an explanation can be decided with one NP oracle call. The paper then proposes propositional encodings for classifiers represented with Sentential Decision Diagrams (SDDs) and for other related propositional languages. The experimental results confirm the practical efficiency of the proposed approach.
AIJul 4, 2021
Efficient Explanations for Knowledge Compilation LanguagesXuanxiang Huang, Yacine Izza, Alexey Ignatiev et al.
Knowledge compilation (KC) languages find a growing number of practical uses, including in Constraint Programming (CP) and in Machine Learning (ML). In most applications, one natural question is how to explain the decisions made by models represented by a KC language. This paper shows that for many of the best known KC languages, well-known classes of explanations can be computed in polynomial time. These classes include deterministic decomposable negation normal form (d-DNNF), and so any KC language that is strictly less succinct than d-DNNF. Furthermore, the paper also investigates the conditions under which polynomial time computation of explanations can be extended to KC languages more succinct than d-DNNF.
AIJun 2, 2021
On Efficiently Explaining Graph-Based ClassifiersXuanxiang Huang, Yacine Izza, Alexey Ignatiev et al.
Recent work has shown that not only decision trees (DTs) may not be interpretable but also proposed a polynomial-time algorithm for computing one PI-explanation of a DT. This paper shows that for a wide range of classifiers, globally referred to as decision graphs, and which include decision trees and binary decision diagrams, but also their multi-valued variants, there exist polynomial-time algorithms for computing one PI-explanation. In addition, the paper also proposes a polynomial-time algorithm for computing one contrastive explanation. These novel algorithms build on explanation graphs (XpG's). XpG's denote a graph representation that enables both theoretical and practically efficient computation of explanations for decision graphs. Furthermore, the paper proposes a practically efficient solution for the enumeration of explanations, and studies the complexity of deciding whether a given feature is included in some explanation. For the concrete case of decision trees, the paper shows that the set of all contrastive explanations can be enumerated in polynomial time. Finally, the experimental results validate the practical applicability of the algorithms proposed in the paper on a wide range of publicly available benchmarks.
LGJun 1, 2021
Efficient Explanations With Relevant SetsYacine Izza, Alexey Ignatiev, Nina Narodytska et al.
Recent work proposed $δ$-relevant inputs (or sets) as a probabilistic explanation for the predictions made by a classifier on a given input. $δ$-relevant sets are significant because they serve to relate (model-agnostic) Anchors with (model-accurate) PI- explanations, among other explanation approaches. Unfortunately, the computation of smallest size $δ$-relevant sets is complete for ${NP}^{PP}$, rendering their computation largely infeasible in practice. This paper investigates solutions for tackling the practical limitations of $δ$-relevant sets. First, the paper alternatively considers the computation of subset-minimal sets. Second, the paper studies concrete families of classifiers, including decision trees among others. For these cases, the paper shows that the computation of subset-minimal $δ$-relevant sets is in NP, and can be solved with a polynomial number of calls to an NP oracle. The experimental evaluation compares the proposed approach with heuristic explainers for the concrete case of the classifiers studied in the paper, and confirms the advantage of the proposed solution over the state of the art.
LGJun 1, 2021
Explanations for Monotonic ClassifiersJoao Marques-Silva, Thomas Gerspacher, Martin Cooper et al.
In many classification tasks there is a requirement of monotonicity. Concretely, if all else remains constant, increasing (resp. decreasing) the value of one or more features must not decrease (resp. increase) the value of the prediction. Despite comprehensive efforts on learning monotonic classifiers, dedicated approaches for explaining monotonic classifiers are scarce and classifier-specific. This paper describes novel algorithms for the computation of one formal explanation of a (black-box) monotonic classifier. These novel algorithms are polynomial in the run time complexity of the classifier and the number of features. Furthermore, the paper presents a practically efficient model-agnostic algorithm for enumerating formal explanations.
LGMay 21, 2021
On Explaining Random Forests with SATYacine Izza, Joao Marques-Silva
Random Forest (RFs) are among the most widely used Machine Learning (ML) classifiers. Even though RFs are not interpretable, there are no dedicated non-heuristic approaches for computing explanations of RFs. Moreover, there is recent work on polynomial algorithms for explaining ML models, including naive Bayes classifiers. Hence, one question is whether finding explanations of RFs can be solved in polynomial time. This paper answers this question negatively, by proving that computing one PI-explanation of an RF is D^P-complete. Furthermore, the paper proposes a propositional encoding for computing explanations of RFs, thus enabling finding PI-explanations with a SAT solver. This contrasts with earlier work on explaining boosted trees (BTs) and neural networks (NNs), which requires encodings based on SMT/MILP. Experimental results, obtained on a wide range of publicly available datasets, demontrate that the proposed SAT-based approach scales to RFs of sizes common in practical applications. Perhaps more importantly, the experimental results demonstrate that, for the vast majority of examples considered, the SAT-based approach proposed in this paper significantly outperforms existing heuristic approaches.
AIMay 14, 2021
SAT-Based Rigorous Explanations for Decision ListsAlexey Ignatiev, Joao Marques-Silva
Decision lists (DLs) find a wide range of uses for classification problems in Machine Learning (ML), being implemented in a number of ML frameworks. DLs are often perceived as interpretable. However, building on recent results for decision trees (DTs), we argue that interpretability is an elusive goal for some DLs. As a result, for some uses of DLs, it will be important to compute (rigorous) explanations. Unfortunately, and in clear contrast with the case of DTs, this paper shows that computing explanations for DLs is computationally hard. Motivated by this result, the paper proposes propositional encodings for computing abductive explanations (AXps) and contrastive explanations (CXps) of DLs. Furthermore, the paper investigates the practical efficiency of a MARCO-like approach for enumerating explanations. The experimental results demonstrate that, for DLs used in practical settings, the use of SAT oracles offers a very efficient solution, and that complete enumeration of explanations is most often feasible.
AIMar 18, 2021
MILP for the Multi-objective VM Reassignment ProblemTakfarinas Saber, Anthony Ventresque, Joao Marques-Silva et al.
Machine Reassignment is a challenging problem for constraint programming (CP) and mixed-integer linear programming (MILP) approaches, especially given the size of data centres. The multi-objective version of the Machine Reassignment Problem is even more challenging and it seems unlikely for CP or MILP to obtain good results in this context. As a result, the first approaches to address this problem have been based on other optimisation methods, including metaheuristics. In this paper we study under which conditions a mixed-integer optimisation solver, such as IBM ILOG CPLEX, can be used for the Multi-objective Machine Reassignment Problem. We show that it is useful only for small or medium-scale data centres and with some relaxations, such as an optimality tolerance gap and a limited number of directions explored in the search space. Building on this study, we also investigate a hybrid approach, feeding a metaheuristic with the results of CPLEX, and we show that the gains are important in terms of quality of the set of Pareto solutions (+126.9% against the metaheuristic alone and +17.8% against CPLEX alone) and number of solutions (8.9 times more than CPLEX), while the processing time increases only by 6% in comparison to CPLEX for execution times larger than 100 seconds.
AIFeb 3, 2021
A Scalable Two Stage Approach to Computing Optimal Decision SetsAlexey Ignatiev, Edward Lam, Peter J. Stuckey et al.
Machine learning (ML) is ubiquitous in modern life. Since it is being deployed in technologies that affect our privacy and safety, it is often crucial to understand the reasoning behind its decisions, warranting the need for explainable AI. Rule-based models, such as decision trees, decision lists, and decision sets, are conventionally deemed to be the most interpretable. Recent work uses propositional satisfiability (SAT) solving (and its optimization variants) to generate minimum-size decision sets. Motivated by limited practical scalability of these earlier methods, this paper proposes a novel approach to learn minimum-size decision sets by enumerating individual rules of the target decision set independently of each other, and then solving a set cover problem to select a subset of rules. The approach makes use of modern maximum satisfiability and integer linear programming technologies. Experiments on a wide range of publicly available datasets demonstrate the advantage of the new approach over the state of the art in SAT-based decision set learning.
LGDec 21, 2020
On Relating 'Why?' and 'Why Not?' ExplanationsAlexey Ignatiev, Nina Narodytska, Nicholas Asher et al.
Explanations of Machine Learning (ML) models often address a 'Why?' question. Such explanations can be related with selecting feature-value pairs which are sufficient for the prediction. Recent work has investigated explanations that address a 'Why Not?' question, i.e. finding a change of feature values that guarantee a change of prediction. Given their goals, these two forms of explaining predictions of ML models appear to be mostly unrelated. However, this paper demonstrates otherwise, and establishes a rigorous formal relationship between 'Why?' and 'Why Not?' explanations. Concretely, the paper proves that, for any given instance, 'Why?' explanations are minimal hitting sets of 'Why Not?' explanations and vice-versa. Furthermore, the paper devises novel algorithms for extracting and enumerating both forms of explanations.
LGOct 21, 2020
On Explaining Decision TreesYacine Izza, Alexey Ignatiev, Joao Marques-Silva
Decision trees (DTs) epitomize what have become to be known as interpretable machine learning (ML) models. This is informally motivated by paths in DTs being often much smaller than the total number of features. This paper shows that in some settings DTs can hardly be deemed interpretable, with paths in a DT being arbitrarily larger than a PI-explanation, i.e. a subset-minimal set of feature values that entails the prediction. As a result, the paper proposes a novel model for computing PI-explanations of DTs, which enables computing one PI-explanation in polynomial time. Moreover, it is shown that enumeration of PI-explanations can be reduced to the enumeration of minimal hitting sets. Experimental results were obtained on a wide range of publicly available datasets with well-known DT-learning tools, and confirm that in most cases DTs have paths that are proper supersets of PI-explanations.
LGAug 13, 2020
Explaining Naive Bayes and Other Linear Classifiers with Polynomial Time and DelayJoao Marques-Silva, Thomas Gerspacher, Martin C. Cooper et al.
Recent work proposed the computation of so-called PI-explanations of Naive Bayes Classifiers (NBCs). PI-explanations are subset-minimal sets of feature-value pairs that are sufficient for the prediction, and have been computed with state-of-the-art exact algorithms that are worst-case exponential in time and space. In contrast, we show that the computation of one PI-explanation for an NBC can be achieved in log-linear time, and that the same result also applies to the more general class of linear classifiers. Furthermore, we show that the enumeration of PI-explanations can be obtained with polynomial delay. Experimental results demonstrate the performance gains of the new algorithms when compared with earlier work. The experimental results also investigate ways to measure the quality of heuristic explanations
LGJul 4, 2019
On Validating, Repairing and Refining Heuristic ML ExplanationsAlexey Ignatiev, Nina Narodytska, Joao Marques-Silva
Recent years have witnessed a fast-growing interest in computing explanations for Machine Learning (ML) models predictions. For non-interpretable ML models, the most commonly used approaches for computing explanations are heuristic in nature. In contrast, recent work proposed rigorous approaches for computing explanations, which hold for a given ML model and prediction over the entire instance space. This paper extends earlier work to the case of boosted trees and assesses the quality of explanations obtained with state-of-the-art heuristic approaches. On most of the datasets considered, and for the vast majority of instances, the explanations obtained with heuristic approaches are shown to be inadequate when the entire instance space is (implicitly) considered.
AINov 26, 2018
Abduction-Based Explanations for Machine Learning ModelsAlexey Ignatiev, Nina Narodytska, Joao Marques-Silva
The growing range of applications of Machine Learning (ML) in a multitude of settings motivates the ability of computing small explanations for predictions made. Small explanations are generally accepted as easier for human decision makers to understand. Most earlier work on computing explanations is based on heuristic approaches, providing no guarantees of quality, in terms of how close such solutions are from cardinality- or subset-minimal explanations. This paper develops a constraint-agnostic solution for computing explanations for any ML model. The proposed solution exploits abductive reasoning, and imposes the requirement that the ML model can be represented as sets of constraints using some target constraint reasoning system for which the decision problem can be answered with some oracle. The experimental results, obtained on well-known datasets, validate the scalability of the proposed approach as well as the quality of the computed solutions.
AIApr 27, 2016
Propositional Abduction with Implicit Hitting SetsAlexey Ignatiev, Antonio Morgado, Joao Marques-Silva
Logic-based abduction finds important applications in artificial intelligence and related areas. One application example is in finding explanations for observed phenomena. Propositional abduction is a restriction of abduction to the propositional domain, and complexity-wise is in the second level of the polynomial hierarchy. Recent work has shown that exploiting implicit hitting sets and propositional satisfiability (SAT) solvers provides an efficient approach for propositional abduction. This paper investigates this earlier work and proposes a number of algorithmic improvements. These improvements are shown to yield exponential reductions in the number of SAT solver calls. More importantly, the experimental results show significant performance improvements compared to the the best approaches for propositional abduction.
AIOct 8, 2013
SAT-based Preprocessing for MaxSAT (extended version)Anton Belov, Antonio Morgado, Joao Marques-Silva
State-of-the-art algorithms for industrial instances of MaxSAT problem rely on iterative calls to a SAT solver. Preprocessing is crucial for the acceleration of SAT solving, and the key preprocessing techniques rely on the application of resolution and subsumption elimination. Additionally, satisfiability-preserving clause elimination procedures are often used. Since MaxSAT computation typically involves a large number of SAT calls, we are interested in whether an input instance to a MaxSAT problem can be preprocessed up-front, i.e. prior to running the MaxSAT solver, rather than (or, in addition to) during each iterative SAT solver call. The key requirement in this setting is that the preprocessing has to be sound, i.e. so that the solution can be reconstructed correctly and efficiently after the execution of a MaxSAT algorithm on the preprocessed instance. While, as we demonstrate in this paper, certain clause elimination procedures are sound for MaxSAT, it is well-known that this is not the case for resolution and subsumption elimination. In this paper we show how to adapt these preprocessing techniques to MaxSAT. To achieve this we recast the MaxSAT problem in a recently introduced labelled-CNF framework, and show that within the framework the preprocessing techniques can be applied soundly. Furthermore, we show that MaxSAT algorithms restated in the framework have a natural implementation on top of an incremental SAT solver. We evaluate the prototype implementation of a MaxSAT algorithm WMSU1 in this setting, demonstrate the effectiveness of preprocessing, and show overall improvement with respect to non-incremental versions of the algorithm on some classes of problems.
LOJul 5, 2012
Generalizing Redundancy in Propositional Logic: Foundations and Hitting Sets DualityAnton Belov, Joao Marques-Silva
Detection and elimination of redundant clauses from propositional formulas in Conjunctive Normal Form (CNF) is a fundamental problem with numerous application domains, including AI, and has been the subject of extensive research. Moreover, a number of recent applications motivated various extensions of this problem. For example, unsatisfiable formulas partitioned into disjoint subsets of clauses (so-called groups) often need to be simplified by removing redundant groups, or may contain redundant variables, rather than clauses. In this report we present a generalized theoretical framework of labelled CNF formulas that unifies various extensions of the redundancy detection and removal problem and allows to derive a number of results that subsume and extend previous work. The follow-up reports contain a number of additional theoretical results and algorithms for various computational problems in the context of the proposed framework.