Expected Shapley-Like Scores of Boolean Functions: Complexity and Applications to Probabilistic Databases
This work addresses the complexity of computing Shapley-like scores for explainable AI in probabilistic databases, offering incremental improvements in tractability for specific cases.
The paper tackles the problem of computing expected Shapley values for Boolean functions in probabilistic settings, showing that these computations are interreducible with expected values of Boolean functions in polynomial time, and it presents a polynomial-time algorithm for deterministic decomposable circuits with experimental validation in a database system.
Shapley values, originating in game theory and increasingly prominent in explainable AI, have been proposed to assess the contribution of facts in query answering over databases, along with other similar power indices such as Banzhaf values. In this work we adapt these Shapley-like scores to probabilistic settings, the objective being to compute their expected value. We show that the computations of expected Shapley values and of the expected values of Boolean functions are interreducible in polynomial time, thus obtaining the same tractability landscape. We investigate the specific tractable case where Boolean functions are represented as deterministic decomposable circuits, designing a polynomial-time algorithm for this setting. We present applications to probabilistic databases through database provenance, and an effective implementation of this algorithm within the ProvSQL system, which experimentally validates its feasibility over a standard benchmark.