LG | Aug 31, 2022

Concept Gradient: Concept-based Interpretation Without Linear Assumption

Andrew Bai, Chih-Kuan Yeh, Pradeep Ravikumar, Neil Y. C. Lin, Cho-Jui Hsieh

arXiv:2208.14966v216.124 citations | h-index: 84 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for more accurate and intuitive interpretations of AI models for users, though it is incremental as it builds upon existing concept-based methods.

The paper tackled the problem of concept-based interpretation for black-box models by proposing Concept Gradient (CG), which extends interpretation beyond linear assumptions, and demonstrated that CG outperforms the widely used Concept Activation Vector (CAV) in toy examples and real-world datasets.

Concept-based interpretations of black-box models are often more intuitive for humans to understand. The most widely adopted approach for concept-based interpretation is Concept Activation Vector (CAV). CAV relies on learning a linear relation between some latent representation of a given model and concepts. Linear separability is usually implicitly assumed but does not hold in general. In this work, we started from the original intent of concept-based interpretation and proposed Concept Gradient (CG), extending concept-based interpretation beyond linear concept functions. We showed that for a general (potentially non-linear) concept, we can mathematically evaluate how a small change in the concept affects the model's prediction, which leads to an extension of gradient-based interpretation to the concept space. We demonstrated empirically that CG outperforms CAV in both toy examples and real-world datasets.
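
The idea of measuring how a small change in a concept affects the prediction can be illustrated with a minimal single-concept sketch. It assumes the model f and the concept g are both differentiable functions of the same input x, and it attributes the change via the chain rule by projecting the model gradient onto the concept gradient: (∇f · ∇g) / ‖∇g‖². This projection is the minimum-norm solution of the chain-rule equation and is an assumption made here for illustration; the paper's exact formulation may differ. The names `numerical_gradient`, `concept_gradient`, `model`, and `concept` are hypothetical and do not come from the paper's code.

```python
def numerical_gradient(fn, x, eps=1e-5):
    """Central-difference estimate of the gradient of fn at the point x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += eps
        down[i] -= eps
        grad.append((fn(up) - fn(down)) / (2 * eps))
    return grad


def concept_gradient(model, concept, x):
    """Single-concept sketch: (grad f . grad g) / ||grad g||^2 (assumed form)."""
    grad_f = numerical_gradient(model, x)
    grad_g = numerical_gradient(concept, x)
    norm_sq = sum(g * g for g in grad_g)
    if norm_sq == 0.0:
        raise ValueError("concept gradient is zero at x; attribution is undefined")
    return sum(a * b for a, b in zip(grad_f, grad_g)) / norm_sq


# Hypothetical toy setup: a non-linear concept and a model that depends
# on the input only through that concept.
def concept(x):
    return x[0] ** 2 + x[1] ** 2


def model(x):
    return 3.0 * concept(x)


cg = concept_gradient(model, concept, [1.0, 2.0])
print(cg)  # close to 3.0, because the model is exactly 3 * concept
```

In this toy case the concept is a non-linear function of the input, so no single fixed direction in input space describes it everywhere, yet the gradient-based attribution still recovers the model's sensitivity to the concept at the chosen point.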
