CV · Feb 5, 2025

Concept Based Explanations and Class Contrasting

arXiv:2502.03422v2 · h-index: 9 · Has Code
AI Analysis

This work addresses the problem of interpretability in deep learning for researchers and practitioners, but it is incremental as it builds on existing concept-based explanation methods.

The paper tackles the challenge of explaining deep neural networks with a concept-based explanation method that explains the prediction for an individual class and contrasts any two classes. In one quantitative test on a ResNet50 model, the explanation for a class is used to automatically select four dataset crops for which the model does not predict that class; for the image combined from those crops, the model predicts the class again for 911 of the 1000 classes (91.1%).

Explaining deep neural networks is challenging due to their large size and non-linearity. In this paper, we introduce a concept-based explanation method that explains the prediction for an individual class and contrasts any two classes, i.e. explains why the model predicts one class over the other. We test it on several openly available classification models trained on ImageNet1K, and we perform both qualitative and quantitative tests. For example, for a ResNet50 model from the PyTorch model zoo, we can use the explanation for why the model predicts a class 'A' to automatically select four dataset crops where the model does not predict class 'A'. The model then predicts class 'A' again for the newly combined image in 91.1% of the cases (it works for 911 out of the 1000 classes). The code, including an .ipynb example, is available on GitHub: https://github.com/rherdt185/concept-based-explanations-and-class-contrasting
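The quantitative test described in the abstract can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the authors' code: the names `combine_crops` and `recovery_rate`, the 2x2 tiling layout, the nested-list image type, and the toy predictor are all assumptions made for this example. How the four crops are actually chosen from the explanation is not shown here; the sketch only covers the final check on the combined image.

```python
from typing import Callable, Dict, List, Sequence

# Toy stand-in for an image: rows of pixel values (hypothetical type for this sketch).
Image = List[List[float]]


def combine_crops(crops: Sequence[Image]) -> Image:
    """Tile four equally sized crops into one image (2x2 layout is an assumption)."""
    if len(crops) != 4:
        raise ValueError("expected exactly four crops")
    top_left, top_right, bottom_left, bottom_right = crops
    # Concatenate rows side by side, then stack the two halves vertically.
    top = [left + right for left, right in zip(top_left, top_right)]
    bottom = [left + right for left, right in zip(bottom_left, bottom_right)]
    return top + bottom


def recovery_rate(
    predict: Callable[[Image], int],
    crops_per_class: Dict[int, Sequence[Image]],
) -> float:
    """Fraction of classes that the model predicts again on the combined image.

    A class counts as a success only if none of its four crops is predicted
    as that class on its own, but the combined image is.
    """
    hits = 0
    for target, crops in crops_per_class.items():
        none_alone = all(predict(crop) != target for crop in crops)
        if none_alone and predict(combine_crops(crops)) == target:
            hits += 1
    return hits / len(crops_per_class)


def toy_predict(image: Image) -> int:
    """Hypothetical two-class model: class 1 if the pixel sum exceeds 3, else class 0."""
    return 1 if sum(sum(row) for row in image) > 3 else 0


# Class 1: each crop alone sums to 2 (predicted 0), the combined image sums to 8 (predicted 1).
# Class 0: each crop alone sums to 4 (predicted 1), the combined image is still predicted 1.
toy_crops = {
    1: [[[1.0, 0.0], [0.0, 1.0]]] * 4,
    0: [[[1.0, 1.0], [1.0, 1.0]]] * 4,
}
rate = recovery_rate(toy_predict, toy_crops)
```

In this toy setup one of the two classes is recovered, so `rate` is 0.5. The figure reported in the abstract corresponds to the same kind of ratio over the ImageNet1K classes: 911 / 1000 = 91.1%.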

