ML LG Mar 12, 2019

Unbiased Measurement of Feature Importance in Tree-Based Methods

arXiv:1903.05179v2, 86 citations
Originality: Incremental advance
AI Analysis

This paper addresses a methodological issue for researchers and practitioners who use tree-based models, but the advance is incremental because it modifies an existing approach.

The authors address the bias in split-improvement variable importance measures in tree-based methods such as Random Forests, which favor features with more potential splits, and show that correcting this bias with out-of-sample data yields better summaries and screening tools.

We propose a modification that corrects the bias in split-improvement variable importance measures in Random Forests and other tree-based methods. These measures have been shown to be biased towards increasing the importance of features with more potential splits. We show that by appropriately incorporating split-improvement as measured on out-of-sample data, this bias can be corrected, yielding better summaries and screening tools.
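The bias described in the abstract can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not the authors' estimator: it fits a single split (a stump) on pure-noise data, picks the threshold that maximizes the Gini decrease on training data, and then re-measures the decrease of that same split on held-out data. The helper names (`gini`, `split_gain`, `best_threshold`, `simulate`) and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 for an empty list)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def split_gain(xs, ys, threshold):
    """Impurity decrease from splitting at x <= threshold (0.0 if no split)."""
    if threshold is None:
        return 0.0
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    n = len(ys)
    return gini(ys) - len(left) / n * gini(left) - len(right) / n * gini(right)


def best_threshold(xs, ys):
    """Threshold with the largest in-sample gain, or None if x is constant."""
    # Splitting at the maximum would leave the right child empty, so drop it.
    candidates = sorted(set(xs))[:-1]
    return max(candidates, key=lambda t: split_gain(xs, ys, t), default=None)


def simulate(trials=200, n=100, seed=0):
    """Average (in-sample, held-out) gain per feature on pure-noise data."""
    rng = random.Random(seed)
    totals = {"binary": [0.0, 0.0], "continuous": [0.0, 0.0]}
    for _ in range(trials):
        # Labels are independent of every feature, so true importance is zero.
        y_train = [rng.randint(0, 1) for _ in range(n)]
        y_test = [rng.randint(0, 1) for _ in range(n)]
        features = {
            # One possible split point.
            "binary": ([rng.randint(0, 1) for _ in range(n)],
                       [rng.randint(0, 1) for _ in range(n)]),
            # Up to n - 1 possible split points.
            "continuous": ([rng.random() for _ in range(n)],
                           [rng.random() for _ in range(n)]),
        }
        for name, (x_train, x_test) in features.items():
            t = best_threshold(x_train, y_train)              # chosen on training data
            totals[name][0] += split_gain(x_train, y_train, t)  # in-sample gain
            totals[name][1] += split_gain(x_test, y_test, t)    # held-out gain
    return {name: (ins / trials, out / trials)
            for name, (ins, out) in totals.items()}


results = simulate()
for name, (in_sample, held_out) in results.items():
    print(f"{name:>10}: in-sample {in_sample:.4f}, held-out {held_out:.4f}")
```

Because the threshold is selected among many candidates on the training data, the continuous noise feature should show a noticeably larger in-sample gain than the binary one, while the held-out gains of the two features should be much closer. This toy only demonstrates the selection effect; the paper's actual correction and its treatment of full forests may differ.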

Code Implementations: 1 repo
