LG | AI | Sep 10, 2020

On Generating Plausible Counterfactual and Semi-Factual Explanations for Deep Learning

arXiv:2009.06399v1 | 123 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for better interpretability in AI for users and developers, though it is incremental as it builds on existing counterfactual methods by extending them to image data and introducing semi-factuals.

The paper tackles the challenge of generating plausible counterfactual and semi-factual explanations for deep learning models in computer vision, particularly for black-box CNN classifiers, and shows that its method, PIECE, produces the most plausible explanations on several measures.

There is growing concern that recent progress in AI, especially in the predictive competence of deep learning models, will be undermined by a failure to properly explain their operation and outputs. In response to this disquiet, counterfactual explanations have become massively popular in eXplainable AI (XAI) due to their proposed computational, psychological, and legal benefits. In contrast, semi-factuals, a similar way in which humans commonly explain their reasoning, have surprisingly received no attention. Most counterfactual methods address tabular rather than image data, partly because the non-discrete nature of the latter makes good counterfactuals difficult to define. Additionally, generating plausible-looking explanations that lie on the data manifold is another issue hampering progress. This paper advances a novel method for generating plausible counterfactuals (and semi-factuals) for black-box CNN classifiers in computer vision. The method, called PlausIble Exceptionality-based Contrastive Explanations (PIECE), modifies all exceptional features in a test image to be normal from the perspective of the counterfactual class (hence concretely defining a counterfactual). Two controlled experiments compare this method to others in the literature, showing that PIECE generates not only the most plausible counterfactuals on several measures, but also the best semi-factuals.
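The core idea in the abstract, resetting "exceptional" features to values that are normal for the counterfactual class, can be illustrated with a small sketch. This is not the paper's implementation: it assumes features are a plain vector (for example, activations from a late CNN layer), it swaps in a simple z-score test as the definition of "exceptional", and it omits the step of turning the modified features back into an image. The function name `normalize_exceptional_features` and the parameter `z_threshold` are hypothetical, chosen only for this illustration.

```python
import numpy as np


def normalize_exceptional_features(x, class_features, z_threshold=2.0):
    """Reset features of x that look exceptional for the counterfactual class.

    ASSUMPTION: "exceptional" is approximated by a z-score cutoff; the
    summary above does not say which statistical test the paper uses.

    x              -- 1-D feature vector of the test instance
    class_features -- 2-D array (n_samples, n_features) of feature vectors
                      from training instances of the counterfactual class
    z_threshold    -- hypothetical cutoff above which a feature is exceptional
    """
    x = np.asarray(x, dtype=float)
    class_features = np.asarray(class_features, dtype=float)

    mean = class_features.mean(axis=0)
    std = class_features.std(axis=0)
    # Avoid division by zero for features that are constant within the class.
    safe_std = np.where(std > 0, std, 1.0)

    z = np.abs(x - mean) / safe_std
    exceptional = z > z_threshold

    # Exceptional features take the class mean; the rest are left unchanged.
    x_prime = np.where(exceptional, mean, x)
    return x_prime, exceptional


# Toy data: three features, four instances of the counterfactual class.
feats = np.array([
    [0.0, 4.0, 9.0],
    [2.0, 6.0, 11.0],
    [0.0, 4.0, 9.0],
    [2.0, 6.0, 11.0],
])
x_prime, mask = normalize_exceptional_features([1.5, 9.0, 10.0], feats)
print(mask.tolist())     # [False, True, False]
print(x_prime.tolist())  # [1.5, 5.0, 10.0]
```

In the toy data only the second feature lies far from what the counterfactual class typically shows, so only that feature is changed. Leaving unexceptional features untouched keeps the modified vector close to the original instance, which is one plausible reading of why such an approach would favour plausible explanations.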

Code Implementations: 1 repo
