CV · LG · Jul 20, 2023

Identifying Interpretable Subspaces in Image Representations

arXiv:2307.10504v2 · 42 citations · h-index: 49 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses interpretability for computer vision researchers and practitioners, offering a method to explain and debug models. The advance is incremental because it builds on existing vision-language models.

The paper tackles the problem of interpreting features in image representations by proposing FALCON, a framework that identifies human-understandable concepts for features using captions and contrastive interpretation. It shows that less than 20% of the representation space can be explained by individual features, and that features become more interpretable when studied in groups.
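The caption-and-contrast idea can be illustrated with a toy sketch. It assumes captions for highly and lowly activating images are already available as plain strings. The scoring rule used here (fraction of high-activation captions containing a word, minus the same fraction among low-activation captions) is an illustrative stand-in, not FALCON's actual scoring function. `score_concepts`, `word_fractions`, and the stopword list are hypothetical names introduced for this example.

```python
from collections import Counter

# Hypothetical, tiny stopword list for illustration only.
STOPWORDS = {"a", "an", "the", "of", "on", "in", "with", "and", "at"}


def word_fractions(captions):
    """Return the fraction of captions that contain each non-stopword word."""
    counts = Counter()
    for caption in captions:
        # Use a set so a word is counted at most once per caption.
        words = {w.strip(".,").lower() for w in caption.split()}
        counts.update(w for w in words if w and w not in STOPWORDS)
    total = max(len(captions), 1)
    return {word: n / total for word, n in counts.items()}


def score_concepts(high_captions, low_captions, top_k=3):
    """Rank words that are common in high-activation captions and rare in low ones."""
    high = word_fractions(high_captions)
    low = word_fractions(low_captions)
    # Contrastive step (assumed form): penalise words that also describe
    # lowly activating images, since they are likely spurious.
    scores = {w: f - low.get(w, 0.0) for w, f in high.items()}
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(w, s) for w, s in ranked[:top_k] if s > 0]


# Made-up captions standing in for a captioning dataset.
high_captions = [
    "a yellow school bus on the street",
    "yellow bus parked on a street",
    "a yellow taxi on the street",
]
low_captions = [
    "a dog on the street",
    "an empty street at night",
]

concepts = score_concepts(high_captions, low_captions)
print(concepts)
```

In this toy data, "street" appears in every caption of both sets, so the contrastive term cancels it out, while "yellow" and "bus" survive as candidate concepts.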

We propose Automatic Feature Explanation using Contrasting Concepts (FALCON), an interpretability framework to explain features of image representations. For a target feature, FALCON captions its highly activating cropped images using a large captioning dataset (like LAION-400m) and a pre-trained vision-language model like CLIP. Each word among the captions is scored and ranked, leading to a small number of shared, human-understandable concepts that closely describe the target feature. FALCON also applies contrastive interpretation using lowly activating (counterfactual) images to eliminate spurious concepts. Although many existing approaches interpret features independently, we observe that, in state-of-the-art self-supervised and supervised models, less than 20% of the representation space can be explained by individual features. We show that features in larger spaces become more interpretable when studied in groups and can be explained with high-order scoring concepts through FALCON. We discuss how extracted concepts can be used to explain and debug failures in downstream tasks. Finally, we present a technique to transfer concepts from one (explainable) representation space to another unseen representation space by learning a simple linear transformation. Code available at https://github.com/NehaKalibhat/falcon-explain.
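The final step, transferring concepts between representation spaces with a simple linear transformation, can be sketched with ordinary least squares. This is a minimal illustration on synthetic data, not the authors' implementation. The direction of the map (from the unseen space into the explainable one), the array names, and the dimensions are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: rows are images, columns are features.
n_images, d_unseen, d_explained = 200, 5, 8
unseen_feats = rng.normal(size=(n_images, d_unseen))    # unseen representation
true_map = rng.normal(size=(d_unseen, d_explained))     # hidden ground truth
noise = 0.01 * rng.normal(size=(n_images, d_explained))
explained_feats = unseen_feats @ true_map + noise       # explainable representation

# Fit W so that unseen_feats @ W approximates explained_feats
# (ordinary least squares over the same set of images).
W, *_ = np.linalg.lstsq(unseen_feats, explained_feats, rcond=None)

# Map features of new images into the explainable space, where
# previously extracted concepts could then be looked up per feature.
new_unseen = rng.normal(size=(3, d_unseen))
mapped = new_unseen @ W
print(mapped.shape)
```

Because the synthetic data is nearly linear by construction, the fitted `W` recovers `true_map` closely. With real representations the fit would only be approximate, and its quality would need to be checked on held-out images.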

Code Implementations: 2 repos
