CV · LG · Mar 29, 2022

Diffusion Models for Counterfactual Explanations

arXiv:2203.15636v1 · 84 citations · h-index: 50
Originality: Incremental advance
AI Analysis

This work addresses the need for more explainable image classifiers. It appears incremental, since it applies existing diffusion models to a known bottleneck in counterfactual generation.

The paper tackles the problem of generating counterfactual explanations for image classifiers by proposing DiME, a method using diffusion models, which surpasses previous state-of-the-art results on 5 out of 6 metrics on the CelebA dataset.

Counterfactual explanations have shown promising results as a post-hoc framework to make image classifiers more explainable. In this paper, we propose DiME, a method allowing the generation of counterfactual images using the recent diffusion models. By leveraging the guided generative diffusion process, our proposed methodology shows how to use the gradients of the target classifier to generate counterfactual explanations of input instances. Further, we analyze current approaches to evaluate spurious correlations and extend the evaluation measurements by proposing a new metric: Correlation Difference. Our experimental validations show that the proposed algorithm surpasses previous State-of-the-Art results on 5 out of 6 metrics on CelebA.
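The idea of steering a generative sampling process with the gradients of the target classifier can be illustrated with a toy sketch. This is not the DiME algorithm: the learned denoising network is replaced by a simple pull back toward the input, and the linear classifier (`W`, `B`) and the parameters `guidance`, `proximity`, and `noise` are hypothetical stand-ins chosen only to show the mechanism in two dimensions.

```python
import math
import random

# Hypothetical toy classifier: p(target | x) = sigmoid(W . x + B).
W = [2.0, -1.0]
B = 0.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def prob_target(x):
    """Probability the toy classifier assigns to the target class."""
    return sigmoid(sum(wi * xi for wi, xi in zip(W, x)) + B)


def grad_log_prob_target(x):
    """Gradient of log p(target | x) with respect to x.

    For a logistic model this is (1 - p) * W.
    """
    p = prob_target(x)
    return [(1.0 - p) * wi for wi in W]


def guided_counterfactual(x0, steps=50, guidance=0.5, proximity=0.05,
                          noise=0.3, seed=0):
    """Toy gradient-guided sampling loop (assumption: not the paper's method).

    Starts from a noised copy of x0, then repeatedly
      * moves along the classifier gradient toward the target class,
      * pulls back toward x0 (placeholder for a learned denoiser),
      * adds noise whose scale shrinks to zero at the last step.
    """
    rng = random.Random(seed)
    x = [xi + noise * rng.gauss(0.0, 1.0) for xi in x0]
    for t in range(steps, 0, -1):
        sigma = noise * (t - 1) / steps  # zero at the final step
        g = grad_log_prob_target(x)
        x = [
            xi + guidance * gi - proximity * (xi - x0i)
            + sigma * rng.gauss(0.0, 1.0)
            for xi, gi, x0i in zip(x, g, x0)
        ]
    return x


if __name__ == "__main__":
    x0 = [-1.0, 1.0]  # classified as non-target by the toy model
    cf = guided_counterfactual(x0)
    print("p(target | input)          =", prob_target(x0))
    print("p(target | counterfactual) =", prob_target(cf))
```

The proximity term plays the role of keeping the counterfactual close to the original instance, while the guidance term is what flips the classifier's decision. In an actual diffusion-based method both would act inside the reverse diffusion steps of a trained model rather than in this hand-written update.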

Code Implementations: 1 repo
