Accurate estimation of feature importance faithfulness for tree models
This work addresses the need for reliable feature attribution in machine learning interpretability, particularly for tree-based models, though it appears incremental as it builds on existing perturbation-based metrics.
The paper tackles the problem of accurately estimating the faithfulness of feature importance for tree models. It introduces a perturbation-based metric called PGI squared, which can be computed efficiently and without Monte Carlo sampling. Experiments indicate that a feature ranking method based on this metric may, in some respects, identify globally important features better than SHAP.
In this paper, we consider a perturbation-based metric of predictive faithfulness of feature rankings (or attributions) that we call PGI squared. When applied to decision tree-based regression models, the metric can be computed accurately and efficiently for arbitrary independent feature perturbation distributions. In particular, the computation does not involve Monte Carlo sampling, which has typically been used for computing similar metrics and is inherently prone to inaccuracies. Moreover, we propose a method of ranking features by their importance for the tree model's predictions based on PGI squared. Our experiments indicate that, in some respects, the method may identify the globally important features better than the state-of-the-art SHAP explainer.
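To illustrate why sampling can be avoided for trees, here is a minimal sketch under assumptions that the abstract does not confirm. It assumes the quantity of interest is the expected squared gap between the prediction at a point and the prediction at a perturbed copy of that point, and that each perturbed feature receives independent Gaussian noise. The toy tree, the function names (`expected_sq_gap`, `monte_carlo_sq_gap`) and the noise model are all hypothetical, and this is not the authors' algorithm. The idea it shows: a perturbed point lands in exactly one leaf, and under independent perturbations the probability of reaching a leaf is a product of per-feature interval probabilities, so the expectation is a finite sum over leaves.

```python
import math
import random

# Hypothetical toy regression tree (an assumption, not from the paper).
# Internal node: (feature_index, threshold, left_subtree, right_subtree).
# Leaf: a float. A sample goes left when x[feature_index] <= threshold.
TREE = (0, 0.5,
        (1, 0.0, 1.0, 3.0),
        (1, 1.0, 2.0, 6.0))


def predict(tree, x):
    """Follow the splits until a leaf value is reached."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if x[j] <= t else right
    return tree


def normal_cdf(z, mean, sigma):
    return 0.5 * (1.0 + math.erf((z - mean) / (sigma * math.sqrt(2.0))))


def interval_prob(x_j, sigma, lo, hi):
    """P(lo < x'_j <= hi) for x'_j ~ N(x_j, sigma^2); sigma == 0 means unperturbed."""
    if sigma == 0.0:
        return 1.0 if lo < x_j <= hi else 0.0
    # Clamp at zero so an empty interval (hi <= lo) contributes nothing.
    return max(0.0, normal_cdf(hi, x_j, sigma) - normal_cdf(lo, x_j, sigma))


def expected_sq_gap(tree, x, sigmas):
    """Exact E[(f(x) - f(x'))^2] as a sum over leaves (assumed metric form)."""
    fx = predict(tree, x)

    def walk(node, bounds):
        if not isinstance(node, tuple):
            # Independence: leaf probability is a product over features.
            p = 1.0
            for j, (lo, hi) in enumerate(bounds):
                p *= interval_prob(x[j], sigmas[j], lo, hi)
            return p * (fx - node) ** 2
        j, t, left, right = node
        lo, hi = bounds[j]
        left_bounds = list(bounds)
        left_bounds[j] = (lo, min(hi, t))
        right_bounds = list(bounds)
        right_bounds[j] = (max(lo, t), hi)
        return walk(left, left_bounds) + walk(right, right_bounds)

    return walk(tree, [(-math.inf, math.inf)] * len(x))


def monte_carlo_sq_gap(tree, x, sigmas, n_samples, seed=0):
    """Sampling-based estimate of the same quantity, for comparison only."""
    rng = random.Random(seed)
    fx = predict(tree, x)
    total = 0.0
    for _ in range(n_samples):
        xp = [rng.gauss(xj, s) if s > 0.0 else xj for xj, s in zip(x, sigmas)]
        total += (fx - predict(tree, xp)) ** 2
    return total / n_samples
```

For `x = [0.0, 0.0]` with only feature 1 perturbed by unit Gaussian noise (`sigmas = [0.0, 1.0]`), `expected_sq_gap` returns 2.0: half of the probability mass stays in the leaf with value 1.0 and half moves to the leaf with value 3.0. The Monte Carlo estimate only approaches that value as `n_samples` grows, which is the kind of inaccuracy the exact computation avoids.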