Multivariate Bernoulli Hoeffding Decomposition: From Theory to Sensitivity Analysis
This work offers an interpretable framework for sensitivity analysis in decision-support problems with binary features. Its contribution is incremental, since it specializes existing decomposition theory for correlated inputs to the Bernoulli case.
The paper tackles the problem of interpreting predictive models with correlated Bernoulli inputs. It provides a complete analytical characterization of the generalized Hoeffding decomposition in this setting, shows that the decomposition admits a closed-form representation, and derives sensitivity measures such as Sobol' indices and Shapley effects directly from it.
Understanding the behavior of predictive models with random inputs can be achieved through functional decompositions into sub-models that capture interpretable effects of input groups. Building on recent advances in uncertainty quantification, the existence and uniqueness of a generalized Hoeffding decomposition have been established for correlated input variables, using oblique projections onto suitable functional subspaces. This work focuses on the case of Bernoulli inputs and provides a complete analytical characterization of the decomposition. We show that, in this discrete setting, the associated subspaces are one-dimensional and that the decomposition admits a closed-form representation. One of the main contributions of this study is to generalize the classical Fourier-Walsh-Hadamard decomposition for pseudo-Boolean functions to the correlated case, yielding an oblique version when the underlying distribution is not a product measure, and recovering the standard orthogonal form when independence holds. This explicit structure offers a fully interpretable framework, clarifying the contribution of each input combination and theoretically enabling model reverse engineering. From this formulation, explicit sensitivity measures, such as Sobol' indices and Shapley effects, can be directly derived. Numerical experiments illustrate the practical interest of the approach for decision-support problems involving binary features. The paper concludes with perspectives on extending the methodology to high-dimensional settings and to models involving inputs with finite, non-binary support.
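To make the independent special case concrete, the following minimal sketch computes the standard orthogonal Fourier-Walsh-Hadamard coefficients of a pseudo-Boolean function under a product of Bernoulli measures, then reads off Sobol' indices as normalized squared coefficients. It is not the oblique construction for correlated inputs described above, which this sketch does not attempt. The function names (`walsh_coefficients`, `sobol_indices`), the toy model `f`, and the parameter vector `p` are hypothetical choices made for illustration only.

```python
from itertools import combinations, product
from math import prod, sqrt


def walsh_coefficients(f, p):
    """Coefficients of f in the orthonormal basis for INDEPENDENT Bernoulli(p[i]) inputs.

    For a subset A, the basis function is the product over i in A of
    (x_i - p_i) / sqrt(p_i * (1 - p_i)); the coefficient is E[f(X) * basis_A(X)].
    """
    d = len(p)
    points = list(product((0, 1), repeat=d))
    # Probability of each point under the product measure (independence assumed).
    weights = [prod(p[i] if x[i] else 1 - p[i] for i in range(d)) for x in points]
    coeffs = {}
    for size in range(d + 1):
        for subset in combinations(range(d), size):
            total = 0.0
            for x, w in zip(points, weights):
                basis = prod((x[i] - p[i]) / sqrt(p[i] * (1 - p[i])) for i in subset)
                total += w * f(x) * basis
            coeffs[subset] = total
    return coeffs


def sobol_indices(coeffs):
    """Sobol' index of each non-empty subset: squared coefficient over total variance."""
    variance = sum(c * c for subset, c in coeffs.items() if subset)
    return {subset: c * c / variance for subset, c in coeffs.items() if subset}


# Hypothetical toy model with three binary features and one interaction term.
def f(x):
    return x[0] + 2 * x[1] * x[2]


p = (0.3, 0.5, 0.8)
coeffs = walsh_coefficients(f, p)
indices = sobol_indices(coeffs)
```

Because the basis is orthonormal under independence, the empty-subset coefficient is the mean of the model and the Sobol' indices sum to one. With correlated inputs this orthogonality is lost, which is the situation the oblique decomposition is meant to handle.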