Counterfactual Explanations for Arbitrary Regression Models
This work addresses the need for interpretable and robust explanations in machine learning, particularly for regression tasks. Its contribution appears incremental: it extends existing counterfactual explanation techniques to regression and adds new algorithmic improvements.
The authors address the problem of generating counterfactual explanations for arbitrary regression models by developing a method based on Bayesian optimisation that supports constraints such as feature sparsity and actionable recourse. Evaluations on real-world benchmarks show high sample-efficiency and precision.
We present a new method for counterfactual explanations (CFEs) based on Bayesian optimisation that applies to both classification and regression models. Our method is a globally convergent search algorithm with support for arbitrary regression models and constraints such as feature sparsity and actionable recourse. It can also answer multiple counterfactual questions in parallel while learning from previous queries. We formulate CFE search for regression models in a rigorous mathematical framework using differentiable potentials, which resolves robustness issues in threshold-based objectives. We prove that, in this framework, (a) verifying the existence of counterfactuals is NP-complete; and (b) finding instances using such potentials is CLS-complete. We describe a unified algorithm for CFEs using a specialised acquisition function that composes both expected improvement and an exponential-polynomial (EP) family with desirable properties. Our evaluation on real-world benchmark domains demonstrates high sample-efficiency and precision.
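To make the contrast between threshold-based objectives and differentiable potentials concrete, the following is a minimal illustrative sketch and not the formulation used in this work. The names `threshold_objective` and `smooth_potential`, the `width` and `power` parameters, and the particular exponential form are all assumptions chosen for illustration; the EP family itself is not specified here. The `expected_improvement` function is the standard closed-form expression for minimisation under a Gaussian posterior, which is one of the two ingredients the acquisition function is said to compose.

```python
import math


def threshold_objective(y, target, tol):
    """Hard objective: 0 if the prediction is within tol of target, else 1.

    Piecewise constant, so it carries no gradient information and can
    flip under arbitrarily small changes in y near the boundary.
    """
    return 0.0 if abs(y - target) <= tol else 1.0


def smooth_potential(y, target, width, power=2):
    """Hypothetical smooth potential: 1 - exp(-(|y - target| / width)^power).

    Assumption for illustration only. It is 0 at the target, approaches 1
    far away, and varies continuously in between (smooth for even power).
    """
    return 1.0 - math.exp(-((abs(y - target) / width) ** power))


def expected_improvement(mu, sigma, best):
    """Closed-form expected improvement for minimisation.

    mu, sigma: posterior mean and standard deviation at a candidate point.
    best: lowest objective value observed so far.
    """
    if sigma <= 0.0:
        # No posterior uncertainty: improvement is deterministic.
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best - mu) * cdf + sigma * pdf


if __name__ == "__main__":
    # Two predictions just either side of the tolerance boundary.
    for y in (1.099, 1.101):
        print(y, threshold_objective(y, 1.0, 0.1), smooth_potential(y, 1.0, 0.1))
```

Under these assumptions, the two predictions on either side of the tolerance boundary receive very different values from the hard objective but nearly identical values from the smooth potential, which is the kind of robustness issue a differentiable formulation is meant to avoid.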