LG | Aug 31, 2022

Concept Gradient: Concept-based Interpretation Without Linear Assumption

arXiv:2208.14966v2 | 24 citations | h-index: 84
Originality: Incremental advance
AI Analysis

This work addresses the need for more accurate and intuitive interpretations of AI models for users, though it is incremental as it builds upon existing concept-based methods.

The paper tackles concept-based interpretation of black-box models by proposing Concept Gradient (CG), which extends interpretation beyond linear concept functions. It reports that CG outperforms the widely used Concept Activation Vector (CAV) on toy examples and real-world datasets.

Concept-based interpretations of black-box models are often more intuitive for humans to understand. The most widely adopted approach for concept-based interpretation is Concept Activation Vector (CAV). CAV relies on learning a linear relation between some latent representation of a given model and concepts. Linear separability is usually implicitly assumed but does not hold true in general. In this work, we started from the original intent of concept-based interpretation and proposed Concept Gradient (CG), extending concept-based interpretation beyond linear concept functions. We showed that for a general (potentially non-linear) concept, we can mathematically evaluate how a small change in the concept affects the model's prediction, which leads to an extension of gradient-based interpretation to the concept space. We demonstrated empirically that CG outperforms CAV in both toy examples and real-world datasets.
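The idea of measuring how a small change in a non-linear concept affects a prediction can be illustrated with a minimal sketch. It assumes the chain-rule reading of the abstract: if the model output f depends on the input x only through a concept g(x), then the gradient of f equals the transposed Jacobian of g times the sensitivity of f to the concept, so that sensitivity can be recovered with a pseudo-inverse. The pseudo-inverse formulation, the function names `numerical_jacobian` and `concept_gradient`, the toy `model` and `concept`, and the use of finite differences in place of automatic differentiation are all assumptions made for illustration, not details confirmed by the text above.

```python
import numpy as np


def numerical_jacobian(fn, x, eps=1e-5):
    """Central-difference Jacobian of fn at x, shape (outputs, inputs)."""
    x = np.asarray(x, dtype=float)
    n_out = np.atleast_1d(fn(x)).size
    jac = np.zeros((n_out, x.size))
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        upper = np.atleast_1d(fn(x + step))
        lower = np.atleast_1d(fn(x - step))
        jac[:, i] = (upper - lower) / (2 * eps)
    return jac


def concept_gradient(model, concept, x):
    """Hypothetical helper: sensitivity of the model output to each concept.

    Assumes grad_f = J_g^T @ s, and solves for s in the least-squares
    sense with the Moore-Penrose pseudo-inverse.
    """
    grad_f = numerical_jacobian(model, x)[0]   # shape (d,)
    jac_g = numerical_jacobian(concept, x)     # shape (k, d)
    return np.linalg.pinv(jac_g.T) @ grad_f    # shape (k,)


# Toy setup (illustrative only): a non-linear concept and a model
# that depends on the input solely through that concept.
def concept(x):
    return np.array([x[0] ** 2 + x[1] ** 2])


def model(x):
    return np.sin(x[0] ** 2 + x[1] ** 2)


x = np.array([0.6, 0.8])            # concept value is 1.0 at this point
cg = concept_gradient(model, concept, x)
expected = np.cos(1.0)              # analytic d sin(c) / dc at c = 1.0
print(cg[0], expected)
```

In this toy case the recovered value matches the analytic derivative of the model with respect to the concept, even though the concept is a non-linear function of the input, which is the situation where a single linear concept direction would not apply uniformly across inputs.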

Code Implementations: 1 repo