Explaining Image Classifiers
This work addresses theoretical gaps in explanation methods for image classifiers. The contribution is incremental, but it matters for improving the interpretability of AI systems.
The paper critiques Mothilal et al.'s approach to explaining image classifiers by showing that it departs from Halpern's definition, which it claims to use, and demonstrates that Halpern's definition can handle explanations of absence and of rare events essentially without modification.
We focus on explaining image classifiers, taking the work of Mothilal et al. [2021] (MMTS) as our point of departure. We observe that, although MMTS claim to be using the definition of explanation proposed by Halpern [2016], they do not quite do so. Roughly speaking, Halpern's definition has a necessity clause and a sufficiency clause. MMTS replace the necessity clause by a requirement that, as we show, implies it. Halpern's definition also allows agents to restrict the set of options considered. While these differences may seem minor, as we show, they can have a nontrivial impact on explanations. We also show that, essentially without change, Halpern's definition can handle two issues that have proved difficult for other approaches: explanations of absence (when, for example, an image classifier for tumors outputs "no tumor") and explanations of rare events (such as tumors).