DifCluE: Generating Counterfactual Explanations with Diffusion Autoencoders and modal clustering
The work addresses model interpretability for users who need diverse explanations, though the contribution appears incremental because it builds on existing diffusion models and clustering techniques.
The paper generates multiple distinct counterfactual explanations for the different modes within a class by combining a Diffusion Autoencoder with latent-space clustering; the authors report that the resulting method, DifCluE, outperforms state-of-the-art methods.
Generating multiple counterfactual explanations for different modes within a class presents a significant challenge, as these modes are distinct yet converge under the same classification. Diffusion probabilistic models (DPMs) have demonstrated a strong ability to capture the underlying modes of data distributions. In this paper, we harness the power of a Diffusion Autoencoder to generate multiple distinct counterfactual explanations. By clustering in the latent space, we uncover the directions corresponding to the different modes within a class, enabling the generation of diverse and meaningful counterfactuals. We introduce a novel methodology, DifCluE, which consistently identifies these modes and produces more reliable counterfactual explanations. Our experimental results demonstrate that DifCluE outperforms the current state-of-the-art in generating multiple counterfactual explanations, offering a significant advancement in model interpretability.
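The idea of clustering in latent space to obtain one direction per mode can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the latent codes have already been produced by some encoder (toy 2-D vectors stand in for them here), uses plain k-means as the clustering step, and takes the direction for each mode to be the vector from the query latent to that mode's centroid. The names `kmeans`, `counterfactual_latents`, `step`, and `k` are hypothetical. Decoding the shifted latents back to images and checking the classifier's prediction are omitted.

```python
import math


def mean(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: dist(p, centroids[j]))
            groups[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return centroids


def counterfactual_latents(z, target_latents, k, step=1.0):
    """Shift latent z toward each mode centroid of the target class.

    Assumption: the per-mode direction is (centroid - z). With step=1.0
    the shifted latent lands exactly on the centroid.
    """
    shifted = []
    for c in kmeans(target_latents, k):
        shifted.append([zi + step * (ci - zi) for zi, ci in zip(z, c)])
    return shifted


# Toy 2-D "latents" for a target class with two well-separated modes.
target = [
    [5.0, 0.0], [5.2, 0.1], [4.8, -0.1],
    [0.0, 5.0], [0.1, 5.2], [-0.1, 4.8],
]
query = [0.0, 0.0]

# One candidate counterfactual latent per discovered mode.
cfs = counterfactual_latents(query, target, k=2, step=1.0)
```

In this toy setup each entry of `cfs` corresponds to one mode of the target class, so a single query yields as many candidate counterfactuals as there are clusters. The number of clusters and the step size are free parameters of the sketch, not values taken from the paper.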