Actionable Counterfactual Explanations Using Bayesian Networks and Path Planning with Applications to Environmental Quality Improvement
This work addresses the need for interpretable and privacy-preserving counterfactual explanations in high-stakes domains such as environmental policy, where fairness and data sensitivity are critical. Its methodological contribution, however, is incremental.
The authors tackle the problem of generating actionable counterfactual explanations for machine learning models. Their method learns a density estimator (a Bayesian network) from the training data and then runs path planning algorithms over the resulting density landscape, so the sensitive training data is never used directly during the search. On a synthetic benchmark of 15 datasets, the approach found more actionable and simpler counterfactuals than state-of-the-art algorithms. It was also applied to a real-world EPA dataset to support a more efficient and equitable study of policies for improving quality of life in U.S. counties.
Counterfactual explanations study what should have changed in order to get an alternative result, enabling end-users to understand machine learning mechanisms with counterexamples. Actionability is defined as the ability to transform the original case to be explained into a counterfactual one. We develop a method for actionable counterfactual explanations that, unlike its predecessors, does not directly leverage training data. Rather, data is only used to learn a density estimator, which creates a search landscape in which path planning algorithms solve the problem and which masks the endogenous data, which can be sensitive or private. We put special focus on estimating the data density using Bayesian networks, demonstrating how their enhanced interpretability is useful in high-stakes scenarios in which fairness is a growing concern. Using a synthetic benchmark comprising 15 datasets, our proposal finds more actionable and simpler counterfactuals than the current state-of-the-art algorithms. We also test our algorithm with a real-world Environmental Protection Agency dataset, facilitating a more efficient and equitable study of policies to improve the quality of life in counties across the United States of America. Our proposal captures the interaction of variables, ensuring equity in decisions, as policies to improve certain domains of study (air, water quality, etc.) can be detrimental in others. In particular, the sociodemographic domain is often involved, where we find important variables related to the ongoing housing crisis that can potentially have a severe negative impact on communities.
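
The idea of searching a density landscape with a path planner can be illustrated with a small sketch. This is not the authors' implementation: it assumes a two-dimensional feature space discretized into a grid, swaps the Bayesian network for a hand-written two-component Gaussian mixture (the hypothetical `log_density`), uses a hypothetical threshold rule as the black-box model (`classify`), and uses Dijkstra's algorithm as one possible path planner. Moving through low-density regions is penalized, so the cheapest path to the other class tends to stay where data is plausible.

```python
import heapq
import math


def log_density(x, y):
    """Hypothetical stand-in for a learned density estimator.

    A two-component isotropic Gaussian mixture; the paper uses a Bayesian
    network instead. The density stays below 1, so -log_density is positive.
    """
    def gauss(mx, my, s):
        sq = (x - mx) ** 2 + (y - my) ** 2
        return math.exp(-sq / (2 * s * s)) / (2 * math.pi * s * s)

    return math.log(0.5 * gauss(1.0, 1.0, 0.8) + 0.5 * gauss(4.0, 4.0, 0.8) + 1e-12)


def classify(x, y):
    """Hypothetical black-box classifier: class 1 when x + y > 6."""
    return int(x + y > 6.0)


def counterfactual_path(start, step=0.25, size=21):
    """Dijkstra over a grid of feature values.

    `start` is a pair of grid indices. Each move costs its Euclidean length
    times -log density at the destination, so sparse regions are expensive.
    Returns (path as coordinates, total cost), ending at the first node
    whose predicted class differs from the start's.
    """
    target = 1 - classify(start[0] * step, start[1] * step)
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, math.inf):
            continue  # stale queue entry
        i, j = node
        if classify(i * step, j * step) == target:
            path = [node]
            while path[-1] in prev:
                path.append(prev[path[-1]])
            path.reverse()
            return [(a * step, b * step) for a, b in path], d
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                ni, nj = i + di, j + dj
                if (di == 0 and dj == 0) or not (0 <= ni < size and 0 <= nj < size):
                    continue
                length = step * math.hypot(di, dj)
                nd = d + length * (-log_density(ni * step, nj * step))
                if nd < dist.get((ni, nj), math.inf):
                    dist[(ni, nj)] = nd
                    prev[(ni, nj)] = node
                    heapq.heappush(heap, (nd, (ni, nj)))
    return None, math.inf


if __name__ == "__main__":
    path, cost = counterfactual_path((4, 4))  # grid node (4, 4) is the point (1.0, 1.0)
    print("counterfactual:", path[-1], "steps:", len(path) - 1, "cost:", round(cost, 3))
```

Note that the search only queries the density function and the classifier, never individual training records, which mirrors the privacy argument made above. The intermediate points of the returned path can be read as a sequence of small, plausible changes leading from the original case to the counterfactual.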