LGAICVMLApr 16, 2019

Counterfactual Visual Explanations

arXiv:1904.07451v2583 citations
AI Analysis

This addresses the need for interpretable AI in vision systems, particularly for teaching humans in fine-grained tasks, though it is incremental as it builds on existing counterfactual explanation methods.

The paper tackles the problem of generating counterfactual visual explanations for image classification systems by identifying spatial regions to swap between images, and finds that users improve at fine-grained bird classification when provided with these explanations alongside training examples.

In this work, we develop a technique to produce counterfactual visual explanations. Given a 'query' image $I$ for which a vision system predicts class $c$, a counterfactual visual explanation identifies how $I$ could change such that the system would output a different specified class $c'$. To do this, we select a 'distractor' image $I'$ that the system predicts as class $c'$ and identify spatial regions in $I$ and $I'$ such that replacing the identified region in $I$ with the identified region in $I'$ would push the system towards classifying $I$ as $c'$. We apply our approach to multiple image classification datasets generating qualitative results showcasing the interpretability and discriminativeness of our counterfactual explanations. To explore the effectiveness of our explanations in teaching humans, we present machine teaching experiments for the task of fine-grained bird classification. We find that users trained to distinguish bird species fare better when given access to counterfactual explanations in addition to training examples.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes