CVFeb 27, 2025

Show and Tell: Visually Explainable Deep Neural Nets via Spatially-Aware Concept Bottleneck Models

arXiv:2502.20134v49 citationsh-index: 25CVPR
Originality Highly original
AI Analysis

This work addresses the need for interpretable AI in vision tasks, offering a novel method for debugging and exploring models, though it is incremental in building upon concept bottleneck models.

The paper tackled the problem of making deep neural networks explainable by developing a framework that transforms any vision network into an interpretable model using spatially-aware concept bottleneck layers, resulting in improved classification performance and high-quality spatial explanations that outperform existing methods on tasks like zero-shot segmentation.

Modern deep neural networks have now reached human-level performance across a variety of tasks. However, unlike humans they lack the ability to explain their decisions by showing where and telling what concepts guided them. In this work, we present a unified framework for transforming any vision neural network into a spatially and conceptually interpretable model. We introduce a spatially-aware concept bottleneck layer that projects "black-box" features of pre-trained backbone models into interpretable concept maps, without requiring human labels. By training a classification layer over this bottleneck, we obtain a self-explaining model that articulates which concepts most influenced its prediction, along with heatmaps that ground them in the input image. Accordingly, we name this method "Spatially-Aware and Label-Free Concept Bottleneck Model" (SALF-CBM). Our results show that the proposed SALF-CBM: (1) Outperforms non-spatial CBM methods, as well as the original backbone, on a variety of classification tasks; (2) Produces high-quality spatial explanations, outperforming widely used heatmap-based methods on a zero-shot segmentation task; (3) Facilitates model exploration and debugging, enabling users to query specific image regions and refine the model's decisions by locally editing its concept maps.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes