CV | AI | Aug 31, 2021

PACE: Posthoc Architecture-Agnostic Concept Extractor for Explaining CNNs

arXiv:2108.13828v1 | 19 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for trustworthy AI by providing interpretable explanations to users of CNNs. It is an incremental advance, since it builds on existing concept extraction methods.

The paper tackles the problem of explaining deep CNN predictions by introducing PACE, a posthoc method that automatically extracts class-specific discriminative concepts from images, with human experiments showing over 72% of these concepts are interpretable.

Although deep CNNs have achieved state-of-the-art performance in image classification tasks, they remain a black box to the humans using them. There is growing interest in explaining the workings of these deep models to improve their trustworthiness. In this paper, we introduce a Posthoc Architecture-agnostic Concept Extractor (PACE) that automatically extracts smaller sub-regions of the image, called concepts, that are relevant to the black-box prediction. PACE tightly integrates the faithfulness of the explanatory framework to the black-box model. To the best of our knowledge, this is the first work that automatically extracts class-specific discriminative concepts in a posthoc manner. The PACE framework is used to generate explanations for two different CNN architectures trained to classify the AWA2 and Imagenet-Birds datasets. Extensive human subject experiments are conducted to validate the human interpretability and consistency of the explanations extracted by PACE. The results of these experiments suggest that over 72% of the concepts extracted by PACE are human interpretable.
