CV · LG · Mar 10

Why Does It Look There? Structured Explanations for Image Classification

Jiarui Li, Zixiang Yin, Samuel J Landry, Zhengming Ding, Ramgopal R. Mettu

arXiv:2603.10234v1 · 5.8 · h-index: 32

Predicted impact: top 82% in CV · last 90 days · Originality: Incremental advance

AI Analysis

This work addresses the need for more transparent and trustworthy AI in image classification, though it appears incremental as it builds on existing XAI methods like GradCAM.

The paper tackles the black-box nature of deep learning models by proposing I2X, a framework that builds structured explanations from unstructured interpretability to answer "why does it look there" in image classification. It demonstrates the framework on MNIST and CIFAR10 and also uses it to improve model accuracy through targeted fine-tuning.

Deep learning models achieve remarkable predictive performance, yet their black-box nature limits transparency and trustworthiness. Although numerous explainable artificial intelligence (XAI) methods have been proposed, they primarily provide saliency maps or concepts (i.e., unstructured interpretability). Existing approaches often rely on auxiliary models (e.g., GPT, CLIP) to describe model behavior, thereby compromising faithfulness to the original models. We propose Interpretability to Explainability (I2X), a framework that builds structured explanations directly from unstructured interpretability by quantifying progress at selected checkpoints during training using prototypes extracted from post-hoc XAI methods (e.g., GradCAM). I2X answers the question of "why does it look there" by providing a structured view of both intra- and inter-class decision making during training. Experiments on MNIST and CIFAR10 demonstrate the effectiveness of I2X in revealing the prototype-based inference process of various image classification models. Moreover, we demonstrate that I2X can be used to improve predictions across different model architectures and datasets: we identify uncertain prototypes recognized by I2X and then apply targeted perturbation of samples, which allows fine-tuning to ultimately improve accuracy. Thus, I2X not only faithfully explains model behavior but also provides a practical approach to guide optimization toward desired targets.
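
For readers unfamiliar with GradCAM, the post-hoc method the abstract names as a source of prototypes, the following is a minimal sketch of the standard Grad-CAM heatmap computation: average the class-score gradients per channel, take the weighted sum of the feature maps, and apply a ReLU. It is not the I2X pipeline, whose details the abstract does not give. The function name `grad_cam`, the nested-list input layout, and the toy values are assumptions made for illustration only.

```python
def grad_cam(activations, gradients):
    """Grad-CAM heatmap from one layer's feature maps and class-score gradients.

    activations, gradients: nested lists shaped [K][H][W] (K channels).
    Returns an [H][W] heatmap scaled to [0, 1].
    """
    K = len(activations)
    H = len(activations[0])
    W = len(activations[0][0])

    # Channel weights: global-average-pool the gradients over the spatial dims.
    weights = [sum(sum(row) for row in gradients[k]) / (H * W) for k in range(K)]

    # Weighted sum of the feature maps, followed by ReLU.
    cam = [[max(0.0, sum(weights[k] * activations[k][i][j] for k in range(K)))
            for j in range(W)] for i in range(H)]

    # Normalize to [0, 1] for display; an all-zero map is left untouched.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam


# Toy input: 2 channels on a 2x2 grid (values are made up for illustration).
acts = [[[1.0, 0.0], [0.0, 2.0]],
        [[0.0, 3.0], [1.0, 0.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]],      # channel weight = 1.0
         [[-1.0, -1.0], [-1.0, -1.0]]]  # channel weight = -1.0
heatmap = grad_cam(acts, grads)
```

In practice the activations and gradients would come from a chosen convolutional layer of the trained classifier. The resulting heatmap is the kind of saliency map the abstract calls unstructured interpretability.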
