CV | Sep 22, 2025

Chat-CBM: Towards Interactive Concept Bottleneck Models with Frozen Large Language Models

Peking U
arXiv:2509.17522v1 | 1 citation | h-index: 12
Originality: Incremental advance
AI Analysis

The paper addresses the restricted forms of user intervention available in interpretable machine learning models, a problem that mainly concerns researchers and practitioners in explainable AI. The advance is incremental because it builds on existing CBM frameworks.

The paper tackles the limitations of traditional Concept Bottleneck Models (CBMs) by introducing Chat-CBM, which replaces score-based classifiers with a language-based classifier using frozen large language models, enabling richer interventions like concept correction and external knowledge integration. Experiments on nine datasets show that Chat-CBM achieves higher predictive performance and substantially improves user interactivity while maintaining interpretability.

Concept Bottleneck Models (CBMs) provide inherent interpretability by first predicting a set of human-understandable concepts and then mapping them to labels through a simple classifier. While users can intervene in the concept space to improve predictions, traditional CBMs typically employ a fixed linear classifier over concept scores, which restricts interventions to manual value adjustments and prevents the incorporation of new concepts or domain knowledge at test time. These limitations are particularly severe in unsupervised CBMs, where concept activations are often noisy and densely activated, making user interventions ineffective. We introduce Chat-CBM, which replaces score-based classifiers with a language-based classifier that reasons directly over concept semantics. By grounding prediction in the semantic space of concepts, Chat-CBM preserves the interpretability of CBMs while enabling richer and more intuitive interventions, such as concept correction, addition or removal of concepts, incorporation of external knowledge, and high-level reasoning guidance. Leveraging the language understanding and few-shot capabilities of frozen large language models, Chat-CBM extends the intervention interface of CBMs beyond numerical editing and remains effective even in unsupervised settings. Experiments on nine datasets demonstrate that Chat-CBM achieves higher predictive performance and substantially improves user interactivity while maintaining the concept-based interpretability of CBMs.
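The contrast drawn in the abstract, between editing concept scores by hand and intervening in language, can be illustrated with a small sketch. This is not the authors' implementation: the concept names, weights, class labels, and the helper functions `linear_predict`, `intervene_scores`, and `build_prompt` are all hypothetical, and no LLM is called. The sketch only shows how a prompt for a frozen language model might be assembled from the active concepts and from user interventions.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical concepts, weights, and labels, chosen only for illustration.
CONCEPTS: List[str] = ["red crown", "webbed feet", "long tail"]
WEIGHTS: Dict[str, List[float]] = {
    "woodpecker": [2.0, -1.0, 0.5],
    "duck": [-1.0, 2.0, -0.5],
}


def linear_predict(scores: Sequence[float]) -> str:
    """Score-based CBM head: argmax over weighted sums of concept scores."""
    logits = {
        label: sum(w * s for w, s in zip(ws, scores))
        for label, ws in WEIGHTS.items()
    }
    return max(logits, key=logits.get)


def intervene_scores(scores: Sequence[float], index: int, value: float) -> List[float]:
    """Numerical intervention: overwrite a single concept score by hand."""
    edited = list(scores)
    edited[index] = value
    return edited


def build_prompt(
    scores: Sequence[float],
    top_k: int = 2,
    remove: Sequence[str] = (),
    add: Sequence[str] = (),
    knowledge: Optional[str] = None,
) -> str:
    """Language-based alternative (assumed prompt format): list the top-k
    concepts as text, apply removals and additions, and append any external
    knowledge before asking for a label."""
    ranked = sorted(zip(CONCEPTS, scores), key=lambda pair: pair[1], reverse=True)
    active = [name for name, _ in ranked[:top_k] if name not in remove]
    active.extend(add)
    lines = ["Observed concepts: " + ", ".join(active) + "."]
    if knowledge:
        lines.append("Additional knowledge: " + knowledge)
    lines.append("Candidate labels: " + ", ".join(WEIGHTS) + ".")
    lines.append("Which label best matches the observed concepts?")
    return "\n".join(lines)


scores = [0.2, 0.9, 0.1]  # a noisy prediction: "webbed feet" is wrongly active

# Traditional CBM: the only available fix is to edit a number.
before = linear_predict(scores)
after = linear_predict(intervene_scores(scores, index=1, value=0.0))

# Language interface: remove a wrong concept, add one that is not in the
# concept set, and pass along domain knowledge as plain text.
prompt = build_prompt(
    scores,
    remove=["webbed feet"],
    add=["zygodactyl feet"],
    knowledge="Zygodactyl feet are typical of woodpeckers.",
)
print(before, "->", after)
print(prompt)
```

In the linear head, a concept outside `CONCEPTS` has no weight and therefore cannot be added at test time. The text prompt accepts it without retraining. The resulting string would then be passed, together with a few labeled examples, to whichever frozen LLM is in use.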

