If Only We Had Better Counterfactual Explanations: Five Key Deficits to Rectify in the Evaluation of Counterfactual XAI Techniques
This work highlights critical evaluation gaps in counterfactual XAI techniques, which hinder scientific progress in making AI explanations more reliable and user-friendly.
The paper surveys 100 counterfactual explanation methods in XAI and finds that only 21% have been user tested. It identifies five key deficits in their evaluation and proposes a roadmap with standardized benchmarks to address them.
In recent years, there has been an explosion of AI research on counterfactual explanations as a solution to the problem of eXplainable AI (XAI). These explanations seem to offer technical, psychological and legal benefits over other explanation techniques. We survey 100 distinct counterfactual explanation methods reported in the literature. This survey addresses the extent to which these methods have been adequately evaluated, both psychologically and computationally, and quantifies the shortfalls occurring. For instance, only 21% of these methods have been user tested. Five key deficits in the evaluation of these methods are detailed, and a roadmap with standardised benchmark evaluations is proposed to resolve the issues arising, issues that at present effectively block scientific progress in this field.