LG ML Sep 16, 2021

WildWood: a new Random Forest algorithm

Stéphane Gaïffas, Ibrahim Merad, Yiyang Yu

arXiv:2109.08010v21.6 h-index: 14 Has Code

Originality: Incremental advance

AI Analysis

This is an incremental improvement for supervised learning practitioners seeking more accurate ensemble methods.

The authors aim to improve Random Forest predictions with WildWood, which aggregates the predictions of all possible subtrees of each tree using exponential weights computed over out-of-bag samples. The resulting algorithm is fast and competitive with standard RF and gradient boosting.

We introduce WildWood (WW), a new ensemble algorithm for supervised learning of Random Forest (RF) type. While standard RF algorithms use bootstrap out-of-bag samples to compute out-of-bag scores, WW uses these samples to produce improved predictions given by an aggregation of the predictions of all possible subtrees of each fully grown tree in the forest. This is achieved by aggregation with exponential weights computed over out-of-bag samples, which are computed exactly and very efficiently thanks to an algorithm called context tree weighting. This improvement, combined with a histogram strategy to accelerate split finding, makes WW fast and competitive compared with other well-established ensemble methods, such as standard RF and extreme gradient boosting algorithms.
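
The abstract does not spell out how the aggregation over all subtrees is computed, so the following is only a minimal sketch of the idea, not the authors' implementation. It assumes a prior that prunes each internal node with probability 1/2 (a common choice in context tree weighting), and the names `Node`, `oob_loss`, `eta`, `predict_ctw` and `predict_brute_force` are hypothetical. The sketch compares a brute-force exponentially weighted average over every pruning of a small tree with a recursion that needs only one bottom-up pass and one walk along the path of the query point.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    pred: float                    # prediction fitted on in-bag samples (assumed given)
    oob_loss: float                # cumulated out-of-bag loss of `pred` in this cell
    threshold: float = 0.0         # split on a single scalar feature, for simplicity
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    @property
    def is_leaf(self) -> bool:
        return self.left is None


def path(root, x):
    """Nodes visited by x, from the root down to a leaf of the full tree."""
    nodes = [root]
    while not nodes[-1].is_leaf:
        v = nodes[-1]
        nodes.append(v.left if x <= v.threshold else v.right)
    return nodes


def prunings(v, eta):
    """Yield (prior * exp(-eta * loss), leaves) for every subtree rooted at v."""
    own = math.exp(-eta * v.oob_loss)
    if v.is_leaf:
        yield own, [v]
        return
    yield 0.5 * own, [v]           # prune here: v becomes a leaf
    for wl, leaves_l in prunings(v.left, eta):
        for wr, leaves_r in prunings(v.right, eta):
            yield 0.5 * wl * wr, leaves_l + leaves_r


def predict_brute_force(root, x, eta):
    """Exponentially weighted average over all subtrees, by enumeration."""
    on_path = {id(v) for v in path(root, x)}
    num = den = 0.0
    for w, leaves in prunings(root, eta):
        leaf = next(l for l in leaves if id(l) in on_path)
        num += w * leaf.pred
        den += w
    return num / den


def subtree_weights(v, eta, out):
    """Bottom-up pass: total weight of all prunings below each node."""
    own = math.exp(-eta * v.oob_loss)
    if v.is_leaf:
        w = own
    else:
        w = 0.5 * own + 0.5 * subtree_weights(v.left, eta, out) * subtree_weights(v.right, eta, out)
    out[id(v)] = w
    return w


def predict_ctw(root, x, eta):
    """Same average, computed by a recursion along the path of x."""
    w = {}
    subtree_weights(root, eta, w)
    nodes = path(root, x)
    f = nodes[-1].pred
    for v in reversed(nodes[:-1]):
        a = 0.5 * math.exp(-eta * v.oob_loss) / w[id(v)]  # share of "prune at v"
        f = a * v.pred + (1.0 - a) * f
    return f


# Toy tree with made-up predictions and out-of-bag losses.
tree = Node(pred=0.5, oob_loss=4.0, threshold=0.0,
            left=Node(pred=0.2, oob_loss=1.0, threshold=-1.0,
                      left=Node(pred=0.0, oob_loss=0.9),
                      right=Node(pred=0.3, oob_loss=0.4)),
            right=Node(pred=0.9, oob_loss=1.5))

for x in (-2.0, -0.5, 1.0):
    print(x, predict_brute_force(tree, x, eta=1.0), predict_ctw(tree, x, eta=1.0))
```

The two predictions should agree up to floating-point error, while the recursion avoids enumerating subtrees, whose number grows very quickly with depth. A practical implementation would likely work with log-weights to avoid underflow when losses are large.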
