DisCERN: Discovering Counterfactual Explanations using Relevance Features from Neighbourhoods
This work addresses the need for interpretable and actionable explanations in machine learning for end-users, though it appears incremental as it builds on existing counterfactual and feature relevance methods.
The paper tackles the problem of generating counterfactual explanations by identifying the minimum subset of feature changes needed to alter a machine learning outcome, introducing the DisCERN algorithm that uses feature relevance from methods like LIME and SHAP to achieve this. Results from a comparative study on five datasets show that DisCERN effectively minimizes actionable changes compared to the DiCE approach.
Counterfactual explanations focus on "actionable knowledge" to help end-users understand how a machine learning outcome could be changed to a more desirable one. For this purpose, a counterfactual explainer needs to discover input dependencies that relate to outcome changes. Identifying the minimum subset of feature changes needed to change the decision is an interesting challenge for counterfactual explainers. The DisCERN algorithm introduced in this paper is a case-based counterfactual explainer. Here, counterfactuals are formed by replacing feature values in the query with those of a nearest unlike neighbour (NUN) until an actionable change is observed. We show how widely adopted feature relevance-based explainers (e.g. LIME, SHAP) can inform DisCERN to identify the minimum subset of "actionable features". We demonstrate our DisCERN algorithm on five datasets in a comparative study with the widely used optimisation-based counterfactual approach DiCE. Our results demonstrate that DisCERN is an effective strategy to minimise the actionable changes necessary to create good counterfactual explanations.
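The substitution loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`nearest_unlike_neighbour`, `discern_sketch`), the Euclidean distance, the toy classifier, and the hand-written `relevance` weights are all assumptions made for illustration. In practice the relevance weights would come from an explainer such as LIME or SHAP; here they are supplied directly so the example stays self-contained.

```python
import numpy as np

def nearest_unlike_neighbour(query, query_label, X, y):
    """Return the row of X closest to `query` (Euclidean, an assumption)
    whose label differs from `query_label`."""
    unlike = X[y != query_label]
    distances = np.linalg.norm(unlike - query, axis=1)
    return unlike[np.argmin(distances)]

def discern_sketch(query, predict, X, y, relevance):
    """Copy NUN feature values into the query, most relevant feature first,
    until the predicted class changes.
    Returns (counterfactual or None, list of changed feature indices)."""
    query_label = predict(query)
    nun = nearest_unlike_neighbour(query, query_label, X, y)
    candidate = query.copy()
    changed = []
    # Visit features in descending order of absolute relevance.
    for i in np.argsort(-np.abs(relevance)):
        if candidate[i] == nun[i]:
            continue  # Copying an identical value would not be a change.
        candidate[i] = nun[i]
        changed.append(int(i))
        if predict(candidate) != query_label:
            return candidate, changed
    return None, changed

# Toy data and a hypothetical rule-based classifier (not from the paper).
X = np.array([[0., 0., 0.], [1., 0., 1.], [2., 2., 0.], [3., 3., 1.]])
y = np.array([0, 0, 1, 1])
predict = lambda x: int(x[0] + 2 * x[1] > 3)

query = np.array([1., 0., 0.])
relevance = np.array([1.0, 2.0, 0.0])  # Stand-in for LIME/SHAP weights.

counterfactual, changed = discern_sketch(query, predict, X, y, relevance)
print(counterfactual, changed)
```

In this toy case the NUN is `[2, 2, 0]`, and copying only the most relevant feature (index 1) is enough to flip the prediction, so a single feature change forms the counterfactual. Ordering by relevance is what keeps the number of changes small; visiting features in arbitrary order could require more substitutions before the outcome changes.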