ML LG | Mar 12, 2019

Unbiased Measurement of Feature Importance in Tree-Based Methods

arXiv:1903.05179v2 | 19.086 citations | Has Code

Originality: Incremental advance

AI Analysis

This paper addresses a methodological issue relevant to researchers and practitioners who use tree-based models. The contribution is incremental because it modifies an existing approach.

The authors address the bias in split-improvement variable importance measures used by tree-based methods such as Random Forests: these measures favor features with more potential splits. They show that correcting the bias with out-of-sample data yields better summaries and screening tools.

We propose a modification that corrects the bias of split-improvement variable importance measures in Random Forests and other tree-based methods. These measures have been shown to be biased towards increasing the importance of features with more potential splits. We show that by appropriately incorporating split-improvement as measured on out-of-sample data, this bias can be corrected, yielding better summaries and screening tools.
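The bias described in the abstract can be illustrated with a small simulation. The sketch below is a simplified illustration, not the authors' estimator: it fits a single split (a stump) per feature on pure-noise data, picks the threshold that maximizes the in-sample Gini decrease, and then re-evaluates that same threshold on held-out data. All function names (`gini`, `decrease`, `best_threshold`, `draw`, `simulate`) and the simulation settings are assumptions made for this example.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 for an empty list)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def decrease(xs, ys, threshold):
    """Weighted Gini decrease from splitting the sample at x <= threshold."""
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    n = len(ys)
    return gini(ys) - len(left) / n * gini(left) - len(right) / n * gini(right)


def best_threshold(xs, ys):
    """Observed value with the largest in-sample decrease, or None."""
    # Drop the maximum: splitting there would leave the right child empty.
    candidates = sorted(set(xs))[:-1]
    if not candidates:
        return None
    return max(candidates, key=lambda t: decrease(xs, ys, t))


def draw(rng, n):
    """Labels plus two features, all independent noise (hypothetical setup)."""
    ys = [rng.randint(0, 1) for _ in range(n)]
    feats = {
        "binary": [rng.randint(0, 1) for _ in range(n)],   # one possible split
        "continuous": [rng.random() for _ in range(n)],    # about n - 1 splits
    }
    return feats, ys


def simulate(n_reps=100, n=100, seed=0):
    """Average (in-sample, held-out) Gini decrease per feature."""
    rng = random.Random(seed)
    totals = {"binary": [0.0, 0.0], "continuous": [0.0, 0.0]}
    for _ in range(n_reps):
        train_x, train_y = draw(rng, n)
        test_x, test_y = draw(rng, n)
        for name in totals:
            t = best_threshold(train_x[name], train_y)
            if t is None:  # degenerate feature with a single distinct value
                continue
            totals[name][0] += decrease(train_x[name], train_y, t)
            totals[name][1] += decrease(test_x[name], test_y, t)
    return {k: (a / n_reps, b / n_reps) for k, (a, b) in totals.items()}


if __name__ == "__main__":
    for name, (in_sample, held_out) in simulate().items():
        print(f"{name:>10}: in-sample={in_sample:.4f}  held-out={held_out:.4f}")
```

Neither feature carries any signal, yet the continuous feature receives a larger average in-sample decrease simply because the search runs over many more candidate thresholds. Re-evaluating the chosen threshold on held-out data removes that selection effect. In this sketch the held-out value stays slightly above zero, because a Gini decrease recomputed on a finite sample cannot be negative.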
