LG ME · May 19

A Family of Divergence Measures for Evaluating the Reconstruction Quality of Explainable Ensemble Trees

Massimo Aria, Agostino Gnasso, Carmela Iorio

arXiv:2605.19618

AI Analysis

For researchers who use surrogate models to explain ensemble learners, the paper provides a more sensitive diagnostic tool for identifying where and why reconstruction fails.

The paper proposes a family of divergence measures, centered on the normalized Loss of Interpretability (nLoI), to evaluate the reconstruction quality of explainable ensemble trees. In simulations, the associated permutation tests show exact Type I error control, and the measures detect fidelity gradients that are invisible to correlation-based alternatives.

Validating interpretable surrogate models for ensemble learners requires measuring agreement between the ensemble's internal representation and its surrogate approximation, rather than mere association. Correlation-based approaches are scale-invariant and fail to detect systematic discrepancies in co-occurrence structure. We propose a statistical framework grounded in the agreement-association distinction, centered on the normalized Loss of Interpretability (nLoI). Rooted in the Cressie-Read power divergence family with lambda equal to 2, the nLoI admits a closed-form decomposition into within-node and between-node components, providing a unique diagnostic capability to identify precisely where and why reconstruction fails. The framework incorporates four complementary measures capturing distinct structural facets of approximation quality. A unified permutation testing procedure delivers valid inference for all measures within a single resampling pass. Theoretical properties, including boundedness and symmetry, are established for each metric. Monte Carlo simulations and empirical evaluations confirm exact Type I error control and demonstrate that these measures detect reconstruction fidelity gradients invisible to correlation-based alternatives. The framework is developed and illustrated in the context of Explainable Ensemble Trees (E2Tree), and empirical evaluation on three benchmark datasets illustrates the practical utility of the framework.
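
The agreement-association distinction in the abstract can be illustrated with a small sketch. It is not the paper's nLoI: it only evaluates the standard Cressie-Read power divergence statistic with lambda equal to 2 on two hypothetical weight vectors and compares it with Pearson correlation. The vectors `ensemble` and `surrogate_scaled`, and the helper names `pearson` and `cressie_read`, are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cressie_read(obs, exp, lam=2.0):
    """Cressie-Read power divergence statistic:
    2 / (lam * (lam + 1)) * sum(o * ((o / e) ** lam - 1)).
    Assumes strictly positive entries and lam not in {0, -1}."""
    scale = 2.0 / (lam * (lam + 1.0))
    return scale * sum(o * ((o / e) ** lam - 1.0) for o, e in zip(obs, exp))


# Hypothetical co-occurrence weights (illustrative values, not from the paper).
ensemble = [0.9, 0.8, 0.2, 0.1, 0.6]
# A surrogate that reproduces the pattern but systematically halves every weight.
surrogate_scaled = [0.5 * v for v in ensemble]

r = pearson(ensemble, surrogate_scaled)
d_same = cressie_read(ensemble, ensemble)
d_scaled = cressie_read(ensemble, surrogate_scaled)

print("correlation:", r)
print("divergence (identical):", d_same)
print("divergence (halved):", d_scaled)
```

Under these assumptions the correlation is 1 (up to floating-point error) even though every weight is halved, whereas the divergence is zero only for the identical vectors and strictly positive for the halved ones. This is the sense in which a scale-invariant association measure can miss a systematic discrepancy that an agreement measure registers.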
