LG AI ML | Oct 21, 2024

Bayesian Concept Bottleneck Models with LLM Priors

Jean Feng, Avni Kothari, Luke Zier, Chandan Singh, Yan Shuo Tan

arXiv:2410.15555v2 | 15.013 citations | h-index: 3 | Has Code

Originality: Highly original

AI Analysis

The paper addresses the interpretability-accuracy tradeoff for practitioners who need transparent models. The contribution is incremental in that it builds on existing Concept Bottleneck Models (CBMs) by integrating LLMs.

This paper tackles the tradeoff between interpretability and accuracy in Concept Bottleneck Models by proposing BC-LLM, which uses LLMs as both a prior and a concept extraction mechanism within a Bayesian framework to search over a potentially infinite set of concepts. The authors report that BC-LLM outperforms interpretable baselines, and black-box models in certain settings, while converging more rapidly towards relevant concepts and being more robust to out-of-distribution samples.

Concept Bottleneck Models (CBMs) have been proposed as a compromise between white-box and black-box models, aiming to achieve interpretability without sacrificing accuracy. The standard training procedure for CBMs is to predefine a candidate set of human-interpretable concepts, extract their values from the training data, and identify a sparse subset as inputs to a transparent prediction model. However, such approaches are often hampered by the tradeoff between exploring a sufficiently large set of concepts versus controlling the cost of obtaining concept extractions, resulting in a large interpretability-accuracy tradeoff. This work investigates a novel approach that sidesteps these challenges: BC-LLM iteratively searches over a potentially infinite set of concepts within a Bayesian framework, in which Large Language Models (LLMs) serve as both a concept extraction mechanism and prior. Even though LLMs can be miscalibrated and hallucinate, we prove that BC-LLM can provide rigorous statistical inference and uncertainty quantification. Across image, text, and tabular datasets, BC-LLM outperforms interpretable baselines and even black-box models in certain settings, converges more rapidly towards relevant concepts, and is more robust to out-of-distribution samples.
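
The abstract does not spell out the search procedure, so the following is only a toy sketch of what an iterative, Bayesian-style search over concepts could look like. It is not the authors' algorithm. Everything in it is an assumption made for illustration: the `CANDIDATES` list and `propose_concept` stand in for an LLM prior, the substring check in `extract` stands in for LLM concept extraction, the fixed 0.9/0.1 "any concept present" rule stands in for the transparent prediction model, and the Metropolis-style accept step is one common way to explore a posterior rather than the update the paper necessarily uses.

```python
import math
import random

# Hypothetical finite concept pool; in the paper the space is potentially infinite.
CANDIDATES = ["fever", "cough", "fracture", "rash", "fatigue"]

# Toy labelled notes: the label is 1 when "fever" or "cough" is mentioned.
NOTES = [
    "fever and cough", "fever, fatigue", "cough with rash", "fracture of wrist",
    "rash on arm", "fatigue only", "fever", "cough",
    "fracture and fatigue", "no complaints",
]
LABELS = [1, 1, 1, 0, 0, 0, 1, 1, 0, 0]


def propose_concept(rng, current):
    """Stub for an LLM prior: uniformly pick a concept not already in the model."""
    return rng.choice([c for c in CANDIDATES if c not in current])


def extract(concept, note):
    """Stub for LLM concept extraction: 1 if the concept word appears in the note."""
    return 1 if concept in note else 0


def log_likelihood(concepts, notes, labels):
    """Toy transparent model: P(y=1) is 0.9 if any concept is present, else 0.1."""
    total = 0.0
    for note, y in zip(notes, labels):
        present = any(extract(c, note) for c in concepts)
        p = 0.9 if present else 0.1
        total += math.log(p if y == 1 else 1.0 - p)
    return total


def search(notes, labels, k=2, iters=200, seed=0):
    """Replace one concept at a time, accepting with a Metropolis-style rule."""
    rng = random.Random(seed)
    concepts = rng.sample(CANDIDATES, k)
    ll = log_likelihood(concepts, notes, labels)
    best, best_ll = list(concepts), ll
    for t in range(iters):
        i = t % k  # cycle through the concept slots
        proposal = list(concepts)
        proposal[i] = propose_concept(rng, concepts)
        new_ll = log_likelihood(proposal, notes, labels)
        # The uniform proposal is symmetric, so the likelihood ratio suffices.
        if rng.random() < math.exp(min(0.0, new_ll - ll)):
            concepts, ll = proposal, new_ll
            if ll > best_ll:
                best, best_ll = list(concepts), ll
    return best, best_ll


if __name__ == "__main__":
    found, score = search(NOTES, LABELS)
    print(sorted(found), round(score, 3))
```

The sketch returns only the best concept set it visited, which is a point summary. A Bayesian treatment of the kind the abstract describes would instead keep the visited concept sets as posterior samples, which is what makes uncertainty quantification possible.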
