LG · AI · CV · Dec 18, 2024

Adaptive Concept Bottleneck for Foundation Models Under Distribution Shifts

arXiv:2412.14097v1 · 10 citations · h-index: 8
Originality: Incremental advance
AI Analysis

This work addresses interpretability challenges in critical domains like healthcare and finance, offering an incremental improvement over existing concept bottleneck models for deployment in shifting environments.

The paper tackles the problem of keeping foundation models interpretable under distribution shifts. It proposes an adaptive concept bottleneck framework that dynamically adjusts the concept vectors and the prediction layer using only unlabeled target data, yielding post-deployment accuracy gains of up to 28% and concept-based interpretations that are better aligned with the test data.

Advancements in foundation models (FMs) have led to a paradigm shift in machine learning. The rich, expressive feature representations from these pre-trained, large-scale FMs are leveraged for multiple downstream tasks, usually via lightweight fine-tuning of a shallow fully-connected network following the representation. However, the non-interpretable, black-box nature of this prediction pipeline can be a challenge, especially in critical domains such as healthcare, finance, and security. In this paper, we explore the potential of Concept Bottleneck Models (CBMs) for transforming complex, non-interpretable foundation models into interpretable decision-making pipelines using high-level concept vectors. Specifically, we focus on the test-time deployment of such an interpretable CBM pipeline "in the wild", where the input distribution often shifts from the original training distribution. We first identify the potential failure modes of such a pipeline under different types of distribution shifts. We then propose an adaptive concept bottleneck framework that addresses these failure modes by dynamically adapting the concept-vector bank and the prediction layer based solely on unlabeled data from the target domain, without access to the source (training) dataset. Empirical evaluations with various real-world distribution shifts show that our adaptation method produces concept-based interpretations better aligned with the test data and boosts post-deployment accuracy by up to 28%, aligning the CBM performance with that of non-interpretable classification.
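
To make the pipeline in the abstract concrete, here is a minimal sketch of a generic concept bottleneck forward pass: a feature vector from a frozen foundation model is scored against a bank of concept vectors, and a linear prediction layer maps those scores to class logits. This is not the paper's implementation. The function names, the use of cosine similarity, and the toy numbers are all assumptions made for illustration, and the paper's adaptation step is not shown.

```python
import math


def normalize(vector):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def cbm_forward(feature, concept_bank, weights, bias):
    """Hypothetical concept bottleneck forward pass for one input.

    feature:      embedding of length d from a frozen backbone
    concept_bank: k concept vectors, each of length d
    weights:      one row per class, each row of length k
    bias:         one value per class
    """
    unit_feature = normalize(feature)
    # Concept scores: cosine similarity to each concept vector (an assumption).
    concept_scores = [dot(unit_feature, normalize(c)) for c in concept_bank]
    # Prediction layer: linear map from concept scores to class logits.
    logits = [dot(concept_scores, row) + b for row, b in zip(weights, bias)]
    return concept_scores, logits


# Toy example: 3-dimensional features, 2 concepts, 2 classes.
feature = [1.0, 0.0, 0.0]
concept_bank = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
weights = [[2.0, 0.0], [0.0, 2.0]]
bias = [0.0, 0.0]

scores, logits = cbm_forward(feature, concept_bank, weights, bias)
predicted_class = logits.index(max(logits))
```

Because every logit is a weighted sum of named concept scores, a prediction can be explained by reading off which concepts contributed most. In this sketch, the concept bank and the prediction layer (`weights`, `bias`) stand in for the two components the abstract says are adapted under distribution shift.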

