CV · Mar 24, 2025

Explaining Domain Shifts in Language: Concept erasing for Interpretable Image Classification

Zequn Zeng, Yudi Su, Jianqiao Sun, Tiansheng Wen, Hao Zhang, Zhengjue Wang, Bo Chen, Hongwei Liu, Jiawei Ma

arXiv:2503.18483v1 · 4 citations · h-index: 14 · Has Code · CVPR

Originality: Incremental advance

AI Analysis

This work targets the out-of-distribution generalization of concept-based models for high-stakes applications by reducing the influence of domain-specific concepts on predictions. It is an incremental advance in concept-based interpretability.

The paper addresses domain-specific concepts that undermine generalization in concept-based image classification. It proposes a Language-guided Concept-Erasing (LanCE) framework with a plug-in domain descriptor orthogonality (DDO) regularizer, which the authors report significantly improves out-of-distribution generalization over previous state-of-the-art concept-based models on four standard benchmarks and three newly introduced ones.

Concept-based models can map black-box representations to human-understandable concepts, which makes the decision-making process more transparent and allows users to understand the reasons behind predictions. However, domain-specific concepts often impact the final predictions, which undermines the model's generalization capabilities and prevents the model from being used in high-stakes applications. In this paper, we propose a novel Language-guided Concept-Erasing (LanCE) framework. In particular, we empirically demonstrate that pre-trained vision-language models (VLMs) can approximate distinct visual domain shifts via domain descriptors, while prompting large language models (LLMs) can easily simulate a wide range of descriptors of unseen visual domains. Then, we introduce a novel plug-in domain descriptor orthogonality (DDO) regularizer to mitigate the impact of these domain-specific concepts on the final predictions. Notably, the DDO regularizer is agnostic to the design of concept-based models, and we integrate it into several prevailing models. Through evaluation of domain generalization on four standard benchmarks and three newly introduced benchmarks, we demonstrate that DDO can significantly improve the out-of-distribution (OOD) generalization over the previous state-of-the-art concept-based models. Our code is available at https://github.com/joeyz0z/LanCE.
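The abstract does not spell out the form of the DDO regularizer, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes a concept-bottleneck setup in which concept activations are cosine similarities against text embeddings of concepts, and it penalizes how strongly a linear classifier responds to the embeddings of domain descriptors. The function name `ddo_penalty`, its arguments, and the squared-response form are all hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length (eps guards against zero rows)."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def ddo_penalty(class_weights, concept_emb, descriptor_emb):
    """Hypothetical orthogonality penalty (an assumption, not the paper's formula).

    class_weights:  (num_classes, num_concepts) linear head over concept scores
    concept_emb:    (num_concepts, dim) text embeddings of the concepts
    descriptor_emb: (num_descriptors, dim) text embeddings of domain descriptors
    """
    concepts = l2_normalize(concept_emb)
    descriptors = l2_normalize(descriptor_emb)
    # Concept activations a domain descriptor would trigger on its own.
    descriptor_acts = descriptors @ concepts.T          # (M, K)
    # Class logits produced by those activations; zero means every class
    # weight vector is orthogonal to every descriptor's activation pattern.
    logit_response = descriptor_acts @ class_weights.T  # (M, C)
    return float(np.mean(logit_response ** 2))
```

In training, a term like this would be added to the classification loss with a weighting coefficient, which is again an assumed hyperparameter rather than something stated in the abstract.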
