Flexible and Robust Counterfactual Explanations with Minimal Satisfiable Perturbations
This work addresses the need for reliable and user-friendly counterfactual explanations in AI systems, which support fairness and trust. The contribution is incremental: it builds on existing methods to improve robustness and flexibility.
The paper tackles the problem of generating counterfactual explanations that are both flexible and robust. The proposed method restricts changes to abnormal features, moving them into their normal ranges, and models the search as a Boolean satisfiability problem so that as few features as possible are modified. Experiments on synthetic and real-world datasets show that the resulting explanations are more robust while remaining flexible.
Counterfactual explanations (CFEs) exemplify how to minimally modify a feature vector to achieve a different prediction for an instance. CFEs can enhance informational fairness and trustworthiness, and provide suggestions for users who receive adverse predictions. However, recent research has shown that multiple CFEs can be offered for the same instance or for instances with slight differences. Multiple CFEs provide flexible choices and cover diverse desiderata for user selection, but individual fairness and model reliability are damaged if unstable CFEs with different costs are returned. Existing methods fail to exploit flexibility and address non-robustness simultaneously. To address these issues, we propose a conceptually simple yet effective solution named Counterfactual Explanations with Minimal Satisfiable Perturbations (CEMSP). Specifically, CEMSP constrains how the values of abnormal features are changed by using their semantically meaningful normal ranges. For efficiency, we model the problem as a Boolean satisfiability problem to modify as few features as possible. Additionally, CEMSP is a general framework and can easily accommodate more practical requirements, e.g., causality and actionability. We conduct comprehensive experiments on both synthetic and real-world datasets to demonstrate that, compared to existing methods, our method provides more robust explanations while preserving flexibility.
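To make the idea of minimal perturbations concrete, the following is a minimal sketch and not the authors' implementation. It replaces the Boolean satisfiability formulation with plain brute-force enumeration over subsets of abnormal features, which is only practical for a handful of features. The function names, the toy classifier, the normal ranges, and the choice to clamp each abnormal value to the nearest bound of its normal range are all assumptions made for illustration.

```python
from itertools import combinations
from typing import Callable, Dict, List, Sequence, Tuple


def clamp_to_range(value: float, low: float, high: float) -> float:
    """Move a value to the nearest point inside [low, high]."""
    return min(max(value, low), high)


def minimal_perturbations(
    x: Sequence[float],
    normal_ranges: Dict[int, Tuple[float, float]],
    predict: Callable[[Sequence[float]], int],
    target: int,
) -> List[Tuple[int, ...]]:
    """Return every minimal set of abnormal features whose repair yields `target`.

    Brute-force stand-in for a SAT-based search (an assumption of this sketch):
    subsets are tried in order of increasing size, and any superset of an
    already successful subset is skipped, so only minimal sets are kept.
    """
    # Abnormal features are those lying outside their normal range.
    abnormal = [i for i, (lo, hi) in normal_ranges.items() if not lo <= x[i] <= hi]
    found: List[Tuple[int, ...]] = []
    for size in range(1, len(abnormal) + 1):
        for subset in combinations(abnormal, size):
            if any(set(prev) <= set(subset) for prev in found):
                continue  # not minimal: a smaller successful subset is contained in it
            candidate = list(x)
            for i in subset:
                lo, hi = normal_ranges[i]
                candidate[i] = clamp_to_range(x[i], lo, hi)
            if predict(candidate) == target:
                found.append(subset)
    return found


# Hypothetical toy setup: three health indicators with assumed normal ranges.
ranges = {0: (90.0, 120.0), 1: (70.0, 100.0), 2: (18.5, 25.0)}


def toy_predict(v: Sequence[float]) -> int:
    """Toy classifier: predicts 1 (adverse) when at least two indicators are too high."""
    too_high = (v[0] > 120.0) + (v[1] > 100.0) + (v[2] > 25.0)
    return 1 if too_high >= 2 else 0


instance = [150.0, 130.0, 30.0]  # all three features are abnormal
result = minimal_perturbations(instance, ranges, toy_predict, target=0)
print(result)  # [(0, 1), (0, 2), (1, 2)]
```

In this toy case three different minimal sets flip the prediction, which illustrates the flexibility the abstract describes: the user can choose which pair of features to bring back into range. Because every change is tied to a fixed normal range rather than to the exact position of the input, nearby instances with the same abnormal features receive the same sets.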