AI · CV · LG · NE · Jul 24, 2025

On the Performance of Concept Probing: The Influence of the Data (Extended Version)

arXiv:2507.18550v1 · 1 citation · h-index: 3 · ECAI
Originality: Synthesis-oriented
AI Analysis

This work addresses a gap in concept probing research by focusing on the influence of the data used to train probes, an incremental but important step toward improving interpretability in AI.

The paper investigates how the data used to train concept probing models affects their performance in interpreting neural networks for image classification, and it releases concept labels for two popular datasets.

Concept probing has recently garnered increasing interest as a way to help interpret artificial neural networks, dealing both with their typically large size and their subsymbolic nature, which ultimately renders them unfeasible for direct human interpretation. Concept probing works by training additional classifiers to map the internal representations of a model into human-defined concepts of interest, thus allowing humans to peek inside artificial neural networks. Research on concept probing has mainly focused on the model being probed or the probing model itself, paying limited attention to the data required to train such probing models. In this paper, we address this gap. Focusing on concept probing in the context of image classification tasks, we investigate the effect of the data used to train probing models on their performance. We also make available concept labels for two widely used datasets.
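The probing setup described in the abstract can be illustrated with a minimal sketch. It is not the paper's method or code: the activations are synthetic stand-ins for a network's internal representations, the concept is artificially encoded along one feature direction, and the helper names (`train_linear_probe`, `probe_accuracy`, `make_synthetic_activations`) are hypothetical. The probe here is a plain logistic-regression classifier trained by gradient descent, one common choice among many.

```python
import numpy as np


def train_linear_probe(activations, concept_labels, lr=0.1, epochs=500):
    """Fit a logistic-regression probe on activations with gradient descent."""
    n_samples, n_features = activations.shape
    weights = np.zeros(n_features)
    bias = 0.0
    for _ in range(epochs):
        logits = activations @ weights + bias
        probs = 1.0 / (1.0 + np.exp(-logits))
        error = probs - concept_labels  # gradient of the log-loss w.r.t. logits
        weights -= lr * (activations.T @ error) / n_samples
        bias -= lr * error.mean()
    return weights, bias


def probe_accuracy(activations, concept_labels, weights, bias):
    """Fraction of samples whose concept label the probe predicts correctly."""
    predictions = (activations @ weights + bias) > 0
    return float((predictions == concept_labels.astype(bool)).mean())


def make_synthetic_activations(n_samples, n_features=32, seed=0):
    """Assumption: random vectors stand in for a real model's hidden activations."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n_samples)
    activations = rng.normal(size=(n_samples, n_features))
    # Encode the (hypothetical) concept along the first feature direction.
    activations[:, 0] += 3.0 * (2 * labels - 1)
    return activations, labels.astype(float)


# Train the probe on one sample of data and evaluate it on a held-out sample.
train_x, train_y = make_synthetic_activations(400, seed=0)
test_x, test_y = make_synthetic_activations(200, seed=1)
w, b = train_linear_probe(train_x, train_y)
acc = probe_accuracy(test_x, test_y, w, b)
print(f"held-out probe accuracy: {acc:.3f}")
```

In a real setting, `train_x` would hold activations extracted from a chosen layer of the model being probed, and `train_y` the human-provided concept labels for the corresponding images. Varying how much of this data the probe sees, and how it is selected, is the kind of factor the paper studies.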

