Post-hoc Stochastic Concept Bottleneck Models
This work addresses the need for efficient, interpretable models in domains such as healthcare and finance, where retraining may be infeasible. The contribution is incremental, since it builds on existing CBM frameworks.
The paper tackles the problem of improving Concept Bottleneck Models (CBMs) under interventions without retraining. It introduces Post-hoc Stochastic Concept Bottleneck Models (PSCBMs), which add a covariance-prediction module to a pre-trained CBM. On real-world data, PSCBMs consistently match or improve concept and target accuracy, and they perform much better under interventions.
Concept Bottleneck Models (CBMs) are interpretable models that predict the target variable through high-level human-understandable concepts, allowing users to intervene on mispredicted concepts to adjust the final output. While recent work has shown that modeling dependencies between concepts can improve CBM performance, especially under interventions, such approaches typically require retraining the entire model, which may be infeasible when access to the original data or compute is limited. In this paper, we introduce Post-hoc Stochastic Concept Bottleneck Models (PSCBMs), a lightweight method that augments any pre-trained CBM with a multivariate normal distribution over concepts by adding only a small covariance-prediction module, without retraining the backbone model. We propose two training strategies and show on real-world data that PSCBMs consistently match or improve both concept and target accuracy over standard CBMs at test time. Furthermore, we show that due to the modeling of concept dependencies, PSCBMs perform much better than CBMs under interventions, while remaining far more efficient than retraining a similar stochastic model from scratch.
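To illustrate why a multivariate normal over concepts can help under interventions, the following is a minimal sketch of standard Gaussian conditioning: fixing some concepts to user-supplied values shifts the means of the remaining, correlated concepts. It is not taken from the paper. The function name `condition_on_interventions`, the choice to work on concept logits, and the toy mean and covariance values are all assumptions made for illustration, and the paper's actual intervention procedure may differ.

```python
import numpy as np

def condition_on_interventions(mu, sigma, idx_known, values_known):
    """Condition N(mu, sigma) on intervened concepts (hypothetical helper).

    Returns the indices of the non-intervened concepts together with their
    conditional mean and covariance, using the standard Gaussian formulas.
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    idx_known = np.asarray(idx_known, dtype=int)
    values_known = np.asarray(values_known, dtype=float)
    idx_free = np.setdiff1d(np.arange(mu.size), idx_known)

    s_kk = sigma[np.ix_(idx_known, idx_known)]  # covariance among intervened concepts
    s_fk = sigma[np.ix_(idx_free, idx_known)]   # cross-covariance free vs. intervened
    s_ff = sigma[np.ix_(idx_free, idx_free)]    # covariance among free concepts

    # gain = S_fk S_kk^{-1}, computed with a linear solve instead of an explicit inverse
    gain = np.linalg.solve(s_kk, s_fk.T).T
    mu_cond = mu[idx_free] + gain @ (values_known - mu[idx_known])
    sigma_cond = s_ff - gain @ s_fk.T
    return idx_free, mu_cond, sigma_cond

# Toy example (assumed values): concepts 0 and 1 are correlated, concept 2 is independent.
mu = [0.0, 0.0, 0.0]
sigma = [[1.0, 0.8, 0.0],
         [0.8, 1.0, 0.0],
         [0.0, 0.0, 1.0]]

# A user intervenes on concept 0 and sets its logit to 2.0.
free, mu_c, sigma_c = condition_on_interventions(mu, sigma, [0], [2.0])
print(free, mu_c)
print(sigma_c)
```

In this toy case the intervention on concept 0 moves the mean of the correlated concept 1 from 0.0 to 1.6 and reduces its variance, while the independent concept 2 is unchanged. With a diagonal covariance, as in a standard CBM with independent concepts, no such propagation would occur.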