ML | LG | Jul 3, 2024

Fast Calculation of Feature Contributions in Boosting Trees

arXiv:2407.03515v2 | 3 citations | h-index: 3
AI Analysis

This work addresses the need for global feature contribution analysis in machine learning models and is aimed at researchers and practitioners who use tree-based methods. It is incremental, since it builds on existing Shapley value decomposition techniques.

The paper tackles the global evaluation of feature contributions in boosting trees. It proposes Q-SHAP, an algorithm that computes Shapley values for quadratic losses in polynomial time, and its simulations report both faster computation and more accurate feature-specific R² estimates.

Recently, several fast algorithms have been proposed to decompose a predicted value into Shapley values, enabling individualized feature contribution analysis in tree models. While such local decomposition offers valuable insights, it underscores the need for a global evaluation of feature contributions. Although coefficients of determination ($R^2$) allow for comparative assessment of individual features, individualizing $R^2$ is challenged by the underlying quadratic losses. To address this, we propose Q-SHAP, an efficient algorithm that reduces the computational complexity of calculating Shapley values for quadratic losses to polynomial time. Our simulations show that Q-SHAP not only improves computational efficiency but also enhances the accuracy of feature-specific $R^2$ estimates.
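
To make the idea of a feature-specific $R^2$ concrete, here is a minimal brute-force sketch of Shapley values taken over a quadratic loss. It is not Q-SHAP: it enumerates every feature subset and therefore runs in exponential time, which is the cost the paper's algorithm is designed to avoid. The toy data, the `model` function standing in for a fitted tree ensemble, and the choice to average unknown features over the dataset are all assumptions made for illustration.

```python
from itertools import combinations
from math import factorial

# Hypothetical toy data: three binary features; the third is never used by the model.
X = [(0, 0, 1), (0, 1, 0), (1, 0, 0), (1, 1, 1), (0, 1, 1), (1, 0, 1)]
NOISE = [0.1, -0.1, 0.2, -0.2, 0.0, 0.1]


def model(x):
    # Stand-in for a fitted boosting-tree predictor (an assumption, not a real booster).
    return 2.0 * x[0] + 1.0 * x[1] + 0.5 * x[0] * x[1]


y = [model(x) + e for x, e in zip(X, NOISE)]


def restricted_prediction(x, subset):
    """Average the model over the data, holding the features in `subset` fixed at x."""
    total = 0.0
    for background in X:
        z = tuple(x[j] if j in subset else background[j] for j in range(len(x)))
        total += model(z)
    return total / len(X)


def value(subset):
    """Negative sum of squared errors when only the features in `subset` are known."""
    return -sum(
        (yi - restricted_prediction(xi, subset)) ** 2 for xi, yi in zip(X, y)
    )


def shapley_of_loss(n_features):
    """Exact Shapley values of the quadratic loss by enumerating all subsets."""
    phi = [0.0] * n_features
    for j in range(n_features):
        others = [k for k in range(n_features) if k != j]
        for size in range(len(others) + 1):
            weight = (
                factorial(size)
                * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for subset in combinations(others, size):
                s = set(subset)
                phi[j] += weight * (value(s | {j}) - value(s))
    return phi


phi = shapley_of_loss(3)

# Normalize by the total sum of squares to put each share on an R^2-like scale.
y_mean = sum(y) / len(y)
sst = sum((yi - y_mean) ** 2 for yi in y)
r2_by_feature = [p / sst for p in phi]
print(r2_by_feature)
```

The shares in `phi` add up to the drop in squared error between knowing no features and knowing all of them, and the unused third feature receives a share of zero. Because the squared error does not split additively across features the way a prediction does, each share here needs its own pass over all subsets, which is the difficulty the abstract refers to.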
