A Few Good Counterfactuals: Generating Interpretable, Plausible and Diverse Counterfactual Explanations
This work addresses the need for more interpretable and plausible counterfactual explanations in Explainable AI, representing an incremental improvement over existing methods.
The paper addresses the problem that synthetic counterfactual explanations are often invalid, non-sparse, and non-diverse. It proposes a method that adapts native counterfactuals from the original dataset to produce sparse, diverse counterfactuals built from naturally occurring feature values, and reports experiments that explore parametric variations to establish the conditions for optimal performance.
Counterfactual explanations provide a potentially significant solution to the Explainable AI (XAI) problem, but good, native counterfactuals have been shown to occur rarely in most datasets. Hence, the most popular methods generate synthetic counterfactuals using blind perturbation. However, such methods have several shortcomings: the resulting counterfactuals (i) may not be valid data-points (they often use features that do not naturally occur), (ii) may lack the sparsity of good counterfactuals (if they modify too many features), and (iii) may lack diversity (if the generated counterfactuals are minimal variants of one another). We describe a method designed to overcome these problems, one that adapts native counterfactuals in the original dataset to generate sparse, diverse synthetic counterfactuals from naturally occurring features. A series of experiments is reported that systematically explores parametric variations of this novel method on common datasets to establish the conditions for optimal performance.
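To make the idea of adapting native counterfactuals concrete, the following is a minimal sketch, not the authors' algorithm. It assumes that a native counterfactual is a pair of dataset instances with different classes that differ in at most `max_diff` features, and that adaptation means copying the differing feature values from such a pair into the query. All names (`native_pairs`, `adapt`, `max_diff`, `k`), the toy data, and the toy classifier are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def diff_features(a, b, tol=1e-9):
    """Indices of the features on which two instances differ."""
    return [i for i, (u, v) in enumerate(zip(a, b)) if abs(u - v) > tol]


def sq_dist(a, b):
    """Squared Euclidean distance between two instances."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def native_pairs(X, y, max_diff=2):
    """Directed pairs (i, j) of opposite-class instances that differ in
    at most max_diff features (assumed notion of a native counterfactual)."""
    pairs = []
    for i, j in combinations(range(len(X)), 2):
        if y[i] != y[j] and 1 <= len(diff_features(X[i], X[j])) <= max_diff:
            pairs.extend([(i, j), (j, i)])
    return pairs


def adapt(query, X, y, predict, k=3, max_diff=2):
    """Adapt the k nearest native pairs to the query (illustrative only)."""
    q_class = predict(query)
    # Keep pairs whose source instance shares the query's class,
    # ordered by how close that source instance is to the query.
    cands = [(i, j) for i, j in native_pairs(X, y, max_diff) if y[i] == q_class]
    cands.sort(key=lambda p: sq_dist(query, X[p[0]]))
    results = []
    for i, j in cands[:k]:
        candidate = list(query)
        for f in diff_features(X[i], X[j]):
            candidate[f] = X[j][f]  # value taken from the dataset, not perturbed
        # Keep the candidate only if it flips the class and is not a duplicate.
        if predict(candidate) != q_class and candidate not in results:
            results.append(candidate)
    return results


# Toy data and a toy classifier (both assumptions for this sketch).
def predict(x):
    return 1 if x[0] + x[1] > 10 else 0


X = [[2, 3, 1], [2, 9, 1], [4, 3, 5], [9, 3, 5], [1, 1, 1]]
y = [predict(x) for x in X]
query = [3, 3, 1]
counterfactuals = adapt(query, X, y, predict)
print(counterfactuals)
```

Under these assumptions, each candidate changes at most `max_diff` features of the query (sparsity), every substituted value already occurs in the dataset (plausibility), and drawing on several different native pairs yields candidates that are not minimal variants of one another (diversity).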