Latent-CF: A Simple Baseline for Reverse Counterfactual Explanations
This work provides a baseline for generating counterfactual explanations, which are important for individuals affected by model decisions in settings governed by fair lending laws and the GDPR.
This paper proposes Latent-CF, a simple method for generating counterfactual explanations by searching in the latent space of an autoencoder. It balances the speed of basic feature gradient descent methods with the sparseness and authenticity of more complex feature space techniques.
Under fair lending laws and the General Data Protection Regulation (GDPR), the ability to explain a model's prediction is of paramount importance. High-quality explanations are the first step in assessing fairness. Counterfactuals are valuable tools for explainability: they provide actionable, comprehensible explanations for the individual who is subject to decisions based on the prediction. It is therefore important to establish a baseline for producing them. We propose a simple method for generating counterfactuals by using gradient descent to search in the latent space of an autoencoder, and we benchmark our method against approaches that search for counterfactuals in feature space. Additionally, we implement metrics to concretely evaluate the quality of the counterfactuals. We show that latent space counterfactual generation strikes a balance between the speed of basic feature gradient descent methods and the sparseness and authenticity of counterfactuals generated by more complex feature-space-oriented techniques.
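As an illustration of the idea of searching in latent space, the following is a minimal sketch and not the authors' implementation. It assumes a toy linear decoder and a logistic classifier with hand-picked weights, a squared-error loss between the predicted probability and a target probability, and a stopping rule that halts once the prediction flips. All names (`decode`, `predict`, `latent_counterfactual`) and all numeric values are hypothetical. In practice the decoder would come from a trained autoencoder and the starting point would be the encoding of the original input.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def decode(z, W, b):
    # Hypothetical linear decoder mapping a latent vector z to features x = W z + b.
    return [sum(W[i][j] * z[j] for j in range(len(z))) + b[i] for i in range(len(W))]

def predict(x, w, c):
    # Hypothetical logistic classifier returning the probability of the positive class.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + c)

def latent_counterfactual(z0, W, b, w, c, target=1.0, lr=0.5, max_steps=1000):
    """Gradient descent on z so that predict(decode(z)) moves toward `target`.

    Assumed loss: (p - target)^2, with p = predict(decode(z)).
    Assumed stopping rule: stop once p is on the target side of 0.5.
    """
    z = list(z0)
    for step in range(max_steps):
        x = decode(z, W, b)
        p = predict(x, w, c)
        if (p - 0.5) * (target - 0.5) > 0:
            return z, x, p, step
        # Chain rule: dL/dz = 2 (p - target) * p (1 - p) * W^T w
        scale = 2.0 * (p - target) * p * (1.0 - p)
        grad = [scale * sum(W[i][j] * w[i] for i in range(len(W)))
                for j in range(len(z))]
        z = [zj - lr * gj for zj, gj in zip(z, grad)]
    x = decode(z, W, b)
    return z, x, predict(x, w, c), max_steps

# Toy setup: 2 latent dimensions, 3 features (all values are illustrative).
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [0.0, 0.0, 0.0]
w = [1.0, -1.0, 0.5]
c = -1.0

z_start = [0.0, 0.0]  # stands in for the encoder output of the original input
z_cf, x_cf, p_cf, steps = latent_counterfactual(z_start, W, b, w, c)
print("original probability:", predict(decode(z_start, W, b), w, c))
print("counterfactual features:", x_cf)
print("counterfactual probability:", p_cf, "after", steps, "steps")
```

Because the update is applied to `z` rather than to the features directly, every candidate counterfactual is a decoder output, which is the property the latent space search relies on to keep counterfactuals close to the data the autoencoder was trained on.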