AI | Nov 3, 2025

llmSHAP: A Principled Approach to LLM Explainability

arXiv:2511.01311v1 | 1 citations | h-index: 12
Originality: Synthesis-oriented
AI Analysis

This work addresses explainability for LLM-based decision support, but its contribution is incremental: it adapts existing Shapley value methods to stochastic inference.

The paper tackles the challenge of applying Shapley value-based feature attribution to stochastic large language models (LLMs) in decision support systems. It analyzes when the Shapley value principles can be guaranteed and highlights trade-offs between inference speed, agreement with exact Shapley attributions, and principle attainment.

Feature attribution methods help make machine learning-based inference explainable by determining how much one or several features have contributed to a model's output. A particularly popular attribution method is based on the Shapley value from cooperative game theory, a measure that guarantees the satisfaction of several desirable principles, assuming deterministic inference. We apply the Shapley value to feature attribution in large language model (LLM)-based decision support systems, where inference is, by design, stochastic (non-deterministic). We then demonstrate when we can and cannot guarantee Shapley value principle satisfaction across different implementation variants applied to LLM-based decision support, and analyze how the stochastic nature of LLMs affects these guarantees. We also highlight trade-offs between explainable inference speed, agreement with exact Shapley value attributions, and principle attainment.
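To make the point about stochastic inference concrete, here is a minimal sketch, not the paper's implementation. It computes exact Shapley values by enumerating coalitions and checks the efficiency principle (attributions sum to the value of the full feature set minus the value of the empty set). The feature names, the weights, and the Gaussian noise standing in for LLM sampling are all hypothetical. The `cached` wrapper is one assumed way of fixing a single output per coalition; whether it corresponds to any variant studied in the paper is not confirmed by the abstract.

```python
import itertools
import math
import random


def shapley_values(features, value):
    """Exact Shapley values by enumerating every coalition (exponential cost)."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Standard Shapley weight for a coalition of size k.
            weight = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
            for subset in itertools.combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def make_noisy_value(seed):
    """Hypothetical stand-in for a stochastic LLM score: additive signal plus noise."""
    rng = random.Random(seed)
    base = {"age": 0.2, "symptom": 0.5, "history": 0.3}  # illustrative weights

    def value(coalition):
        # Every call draws fresh noise, mimicking non-deterministic inference.
        return sum(base[f] for f in coalition) + rng.gauss(0.0, 0.1)

    return value


def cached(value):
    """Assumed variant: evaluate each coalition once and reuse the result."""
    store = {}

    def wrapper(coalition):
        if coalition not in store:
            store[coalition] = value(coalition)
        return store[coalition]

    return wrapper


features = ["age", "symptom", "history"]
full, empty = frozenset(features), frozenset()

# Cached: the value function is fixed per coalition, so efficiency holds.
v_cached = cached(make_noisy_value(seed=0))
phi_cached = shapley_values(features, v_cached)
gap_cached = sum(phi_cached.values()) - (v_cached(full) - v_cached(empty))

# Fresh sampling: repeated calls on the same coalition disagree, so it can fail.
v_fresh = make_noisy_value(seed=0)
phi_fresh = shapley_values(features, v_fresh)
gap_fresh = sum(phi_fresh.values()) - (v_fresh(full) - v_fresh(empty))

print(f"efficiency gap, cached: {gap_cached:.2e}")
print(f"efficiency gap, fresh:  {gap_fresh:.2e}")
```

In this toy setup the cached gap is zero up to floating-point error, while the fresh-sampling gap generally is not, because the same coalition returns different values on different calls. Caching also cuts the number of model calls, at the cost of attributing a single sampled output rather than an expectation.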

