LG · Jan 26, 2023

Finding Regions of Counterfactual Explanations via Robust Optimization

Donato Maragno, Jannis Kurtz, Tabea E. Röber, Rob Goedhart, Ş. Ilker Birbil, Dick den Hertog

arXiv:2301.11113v319.631 citations · h-index: 39 · Has Code

Originality: Incremental advance

AI Analysis

This work aims to improve explainability and bias detection in classification models by offering users multiple robust counterfactual options. It is an incremental advance: it builds on existing counterfactual explanation methods and adds a focus on robustness.

Existing counterfactual explanation methods typically provide only a single explanation, which may not be feasible for the user. The paper addresses this limitation with an iterative method that computes robust counterfactual explanations, i.e., explanations that remain valid when the features are slightly perturbed. The result is a region of explanations from which the user can choose. The authors prove convergence for common models, including logistic regression and neural networks, and report experiments in which the method generates such explanations efficiently.

Counterfactual explanations play an important role in detecting bias and improving the explainability of data-driven classification models. A counterfactual explanation (CE) is a minimally perturbed data point for which the decision of the model changes. Most of the existing methods can only provide one CE, which may not be achievable for the user. In this work we derive an iterative method to calculate robust CEs, i.e., CEs that remain valid even after the features are slightly perturbed. To this end, our method provides a whole region of CEs allowing the user to choose a suitable recourse to obtain a desired outcome. We use algorithmic ideas from robust optimization and prove convergence results for the most common machine learning methods including logistic regression, decision trees, random forests, and neural networks. Our experiments show that our method can efficiently generate globally optimal robust CEs for a variety of common data sets and classification models.
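To make the notion of a robust CE concrete, here is a minimal sketch for the simplest case only. It is not the paper's iterative method. It assumes a linear decision score w·x + b (as in logistic regression, where the desired class corresponds to a non-negative score), a Euclidean ball of radius rho as the perturbation set, and Euclidean distance as the cost. Under those assumptions the worst-case perturbation is known in closed form, so the robust CE is the projection of the factual point onto the half-space w·x + b ≥ rho·‖w‖. The function names and the parameter rho are hypothetical and chosen for illustration.

```python
import math


def _dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def worst_case_score(x, w, b, rho):
    """Smallest value of w.(x + d) + b over all perturbations with ||d||_2 <= rho.

    For a linear score the minimiser is d = -rho * w / ||w||_2,
    which gives w.x + b - rho * ||w||_2.
    """
    return _dot(w, x) + b - rho * math.sqrt(_dot(w, w))


def robust_ce_linear(x0, w, b, rho):
    """Closest point to x0 (Euclidean) whose whole rho-ball keeps a score >= 0.

    Assumption: linear score and l2 perturbation set; illustrative only.
    """
    norm_sq = _dot(w, w)
    if norm_sq == 0:
        raise ValueError("w must be non-zero")
    # How far the worst-case score of x0 falls below zero.
    shortfall = -worst_case_score(x0, w, b, rho)
    if shortfall <= 0:
        return list(x0)  # x0 is already robustly in the desired class
    # Move along w just far enough to close the shortfall.
    step = shortfall / norm_sq
    return [xi + step * wi for xi, wi in zip(x0, w)]


if __name__ == "__main__":
    w, b, rho = [3.0, 4.0], -10.0, 0.5
    x = robust_ce_linear([0.0, 0.0], w, b, rho)
    print(x, worst_case_score(x, w, b, rho))
```

Every point inside the ball of radius rho around the returned point is itself a valid CE, which is one simple instance of the "region of CEs" described above. For non-linear models such as trees or neural networks no such closed form is available, which is where an iterative scheme becomes necessary.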
