LG · AP · Mar 16, 2021

Generating Interpretable Counterfactual Explanations By Implicit Minimisation of Epistemic and Aleatoric Uncertainties

arXiv:2103.08951v1 · 54 citations
AI Analysis

This work addresses the need for simple, interpretable explanations in safety-critical applications such as medicine. It is incremental in that it builds on existing counterfactual explanation methods.

The authors generate interpretable counterfactual explanations for machine learning classifiers from the classifier's own predictive uncertainty, with no auxiliary model. The resulting explanations score as more interpretable on IM1 than those of existing methods.

Counterfactual explanations (CEs) are a practical tool for demonstrating why machine learning classifiers make particular decisions. For CEs to be useful, it is important that they are easy for users to interpret. Existing methods for generating interpretable CEs rely on auxiliary generative models, which may not be suitable for complex datasets, and incur engineering overhead. We introduce a simple and fast method for generating interpretable CEs in a white-box setting without an auxiliary model, by using the predictive uncertainty of the classifier. Our experiments show that our proposed algorithm generates more interpretable CEs, according to IM1 scores, than existing methods. Additionally, our approach allows us to estimate the uncertainty of a CE, which may be important in safety-critical applications, such as those in the medical domain.
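
The idea in the abstract, searching for a counterfactual by following the classifier's own predictive uncertainty instead of an auxiliary generative model, can be illustrated with a minimal sketch. This is not the authors' algorithm. It assumes a small hypothetical ensemble of linear softmax classifiers with hand-set weights, runs plain gradient descent on the input to lower the cross-entropy towards a target class (averaged over ensemble members), and then reports a standard entropy-based split of the counterfactual's uncertainty into aleatoric and epistemic parts. The names `ensemble_probs`, `counterfactual` and `uncertainties`, and all parameter values, are assumptions made for illustration.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ensemble_probs(x, weights, biases):
    """Class probabilities of each ensemble member, shape (M, C).

    weights: (M, C, D), biases: (M, C), x: (D,)
    """
    return softmax(weights @ x + biases)

def counterfactual(x, target, weights, biases, lr=0.1, steps=500, threshold=0.9):
    """Gradient descent on the input (white-box access to the weights).

    Minimises the cross-entropy towards `target`, averaged over members,
    and stops once the mean predicted probability of `target` reaches
    `threshold`.
    """
    x_cf = x.copy()
    onehot = np.eye(weights.shape[1])[target]
    for _ in range(steps):
        p = ensemble_probs(x_cf, weights, biases)
        if p.mean(axis=0)[target] >= threshold:
            break
        # d/dx of -log p_m[target] for a linear softmax member is W_m^T (p_m - onehot)
        grad = np.einsum("mc,mcd->d", p - onehot, weights) / len(weights)
        x_cf -= lr * grad
    return x_cf

def entropy(p):
    return -(p * np.log(p + 1e-12)).sum(axis=-1)

def uncertainties(p_members):
    """Return (total, aleatoric, epistemic) uncertainty in nats."""
    total = entropy(p_members.mean(axis=0))   # entropy of the averaged prediction
    aleatoric = entropy(p_members).mean()     # average entropy of the members
    return total, aleatoric, total - aleatoric

# Hypothetical toy ensemble: 5 members, 2 classes, 2 input features.
rng = np.random.default_rng(0)
base = np.array([[-1.0, 0.0], [1.0, 0.0]])    # class 1 is favoured when x[0] > 0
weights = base + rng.normal(scale=0.1, size=(5, 2, 2))
biases = np.zeros((5, 2))

x = np.array([-2.0, 0.5])                     # starts out classified as class 0
x_cf = counterfactual(x, target=1, weights=weights, biases=biases)
p_cf = ensemble_probs(x_cf, weights, biases)
total, aleatoric, epistemic = uncertainties(p_cf)
print("counterfactual:", x_cf)
print("total / aleatoric / epistemic:", total, aleatoric, epistemic)
```

Because the search only needs gradients of the classifier's own predictive distribution, no second model has to be trained. The same ensemble outputs that drive the search also give the uncertainty estimate for the returned counterfactual.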
