CV LG Aug 11, 2023

Diffusion-based Visual Counterfactual Explanations -- Towards Systematic Quantitative Evaluation

arXiv:2308.06100v1 | 6 citations | h-index: 6
Originality: Synthesis-oriented
AI Analysis

This work addresses inconsistent evaluation of VCE methods and gives researchers a standardized approach, though it is incremental in that it builds on existing diffusion models.

The authors tackled the lack of systematic evaluation for visual counterfactual explanation (VCE) methods by proposing a framework with metrics, applying it to diffusion-based models on ImageNet, and generating thousands of VCEs to analyze design choices and guide future improvements.

Latest methods for visual counterfactual explanations (VCE) harness the power of deep generative models to synthesize new examples of high-dimensional images of impressive quality. However, it is currently difficult to compare the performance of these VCE methods, as the evaluation procedures vary widely and often boil down to visual inspection of individual examples and small-scale user studies. In this work, we propose a framework for systematic, quantitative evaluation of VCE methods and a minimal set of metrics to be used. We use this framework to explore the effects of certain crucial design choices in the latest diffusion-based generative models for VCEs of natural image classification (ImageNet). We conduct a battery of ablation-like experiments, generating thousands of VCEs for a suite of classifiers of varying complexity, accuracy, and robustness. Our findings suggest multiple directions for future advancements and improvements of VCE methods. By sharing our methodology and our approach to tackling the computational challenges of such a study on a limited hardware setup (including the complete code base), we offer valuable guidance for researchers in the field, fostering consistency and transparency in the assessment of counterfactual explanations.

Code Implementations: 1 repo
