Computing Conditional Shapley Values Using Tabular Foundation Models
This work addresses the slow computation of Shapley values for AI practitioners, offering a faster method with competitive accuracy. The contribution is incremental in that it builds on existing models.
The paper tackles the computational expense of Shapley values in explainable AI by using tabular foundation models such as TabPFN to approximate conditional expectations without retraining. In most cases this yields the best performance; otherwise it is only marginally worse than the best method, at a fraction of the runtime.
Shapley values have become a cornerstone of explainable AI, but they are computationally expensive to use, especially when features are dependent. Evaluating them requires approximating a large number of conditional expectations, either via Monte Carlo integration or regression. Until recently, it has not been possible to fully exploit deep learning for the regression approach, because retraining for each conditional expectation takes too long. Tabular foundation models such as TabPFN overcome this computational hurdle by leveraging in-context learning, so each conditional expectation can be approximated without any retraining. In this paper, we compute Shapley values with multiple variants of TabPFN and compare their performance with state-of-the-art methods on both simulated and real datasets. In most cases, TabPFN yields the best performance; where it does not, it is only marginally worse than the best method, at a fraction of the runtime. We discuss further improvements and how tabular foundation models can be better adapted specifically for conditional Shapley value estimation.
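To make the regression approach concrete, the following is a minimal sketch of how conditional Shapley values could be computed for a single instance by fitting one regression per feature subset S, with value function v(S) = E[f(X) | X_S = x_S]. It is not the paper's implementation: an ordinary least-squares regressor stands in for a tabular foundation model, and the names `fit_predict_linear`, `conditional_shapley`, and the `regressor` argument are hypothetical and introduced only for illustration.

```python
import itertools
import math

import numpy as np


def fit_predict_linear(X_train, y_train, X_query):
    """Least-squares fit with intercept (assumed stand-in for a foundation model)."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    B = np.column_stack([np.ones(len(X_query)), X_query])
    return B @ coef


def conditional_shapley(X_train, f_train, x, regressor=fit_predict_linear):
    """Exact Shapley values for instance x, with each v(S) estimated by
    regressing the model output f(X) on the observed features X_S."""
    d = X_train.shape[1]
    v = {}
    for size in range(d + 1):
        for S in itertools.combinations(range(d), size):
            if size == 0:
                # Empty coalition: unconditional mean of the model output.
                v[S] = float(np.mean(f_train))
            else:
                cols = list(S)
                pred = regressor(X_train[:, cols], f_train, x[cols][None, :])
                v[S] = float(pred[0])

    phi = np.zeros(d)
    for j in range(d):
        others = [k for k in range(d) if k != j]
        for size in range(d):
            # Shapley weight |S|! (d - |S| - 1)! / d!
            weight = (math.factorial(size) * math.factorial(d - size - 1)
                      / math.factorial(d))
            for S in itertools.combinations(others, size):
                S_with_j = tuple(sorted(S + (j,)))
                phi[j] += weight * (v[S_with_j] - v[S])
    return phi, v


# Toy data: two correlated features and one independent feature.
rng = np.random.default_rng(0)
cov = [[1.0, 0.8, 0.0], [0.8, 1.0, 0.0], [0.0, 0.0, 1.0]]
X = rng.multivariate_normal(np.zeros(3), cov, size=500)
beta = np.array([1.0, 2.0, -1.0])
f_train = X @ beta + 1.0          # outputs of a (here linear) model to explain
x = np.array([1.0, 0.5, -0.5])    # instance to explain

phi, v = conditional_shapley(X, f_train, x)
print(phi, phi.sum(), v[(0, 1, 2)] - v[()])
```

The sketch enumerates all 2^d subsets, so it is only practical for small d. Each subset needs its own regression fit, which is the step that becomes prohibitive when a deep model must be retrained every time, and the step that in-context learning avoids. Any regressor with the same fit-then-predict behaviour could in principle be passed through the `regressor` argument.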