LG · Mar 30, 2021

Generalized Linear Tree Space Nearest Neighbor

arXiv:2103.16408v11.6

Originality: Incremental advance

AI Analysis

This is an incremental improvement for machine learning practitioners seeking alternatives to ensemble methods like Random Forest.

The authors aim to improve predictive accuracy by stacking decision trees: the trees are projected into an ordered, time-split, out-of-fold one nearest neighbor space, and the resulting nearest neighbor predictions are combined with a linear model. The resulting method, GLTSNN, is competitive with Random Forest in Mean Squared Error on several datasets.
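One ingredient of the description above, the ordered out-of-fold one nearest neighbor prediction, can be sketched roughly as follows. This is an illustrative reading, not the authors' implementation: it assumes rows are already sorted by time, that "ordered time split" means each row may only look at rows in strictly earlier folds, and that plain feature vectors stand in for the tree-space coordinates. The function name `ordered_oof_1nn` and its parameters are hypothetical.

```python
from typing import List, Optional, Sequence


def ordered_oof_1nn(
    X: Sequence[Sequence[float]],
    y: Sequence[float],
    n_folds: int,
) -> List[Optional[float]]:
    """Time-ordered out-of-fold 1NN predictions (illustrative sketch).

    Assumption: rows are sorted by time. They are cut into `n_folds`
    contiguous folds, and each row is predicted by the target of its
    nearest neighbor among rows in strictly earlier folds. Rows in the
    first fold have no history and get None.
    """
    n = len(X)
    fold_size = -(-n // n_folds)  # ceiling division
    preds: List[Optional[float]] = []
    for i in range(n):
        # Index of the first row in row i's own fold; everything before
        # it belongs to earlier folds and may be used as a neighbor.
        history_end = (i // fold_size) * fold_size
        if history_end == 0:
            preds.append(None)
            continue
        # Nearest earlier row by squared Euclidean distance.
        best = min(
            range(history_end),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(X[i], X[j])),
        )
        preds.append(y[best])
    return preds


X = [[0.0], [1.0], [0.1], [0.9], [0.2], [1.1]]
y = [10.0, 20.0, 11.0, 19.0, 12.0, 21.0]
oof = ordered_oof_1nn(X, y, n_folds=3)
```

In a full stacking pipeline, predictions like `oof` would be computed once per tree and then fed as features to a linear model; that step, and the repetition and averaging, are left out of this sketch.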

We present a novel method of stacking decision trees by projection into an ordered time split out-of-fold (OOF) one nearest neighbor (1NN) space. The predictions of these one nearest neighbors are combined through a linear model. This process is repeated many times and averaged to reduce variance. Generalized Linear Tree Space Nearest Neighbor (GLTSNN) is competitive with Random Forest (RF) with respect to Mean Squared Error (MSE) on several publicly available datasets. Some of the theoretical and applied advantages of GLTSNN are discussed. We conjecture that a classifier based on GLTSNN would have an error that is asymptotically bounded by twice the Bayes error rate, like the k = 1 nearest neighbor classifier.
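For context, the classical 1NN result that the conjecture refers to (Cover and Hart, 1967) can be written as below. The symbols are notation introduced here rather than taken from the paper: R* is the Bayes error rate, R_1NN is the asymptotic error of the 1NN classifier, and M is the number of classes. The bound is known for 1NN only; whether it carries over to a GLTSNN-based classifier is the paper's conjecture, not an established result.

```latex
R^{*} \;\le\; R_{\mathrm{1NN}} \;\le\; R^{*}\left(2 - \frac{M}{M-1}\,R^{*}\right) \;\le\; 2R^{*}
```

For two classes (M = 2) the middle term reduces to 2R*(1 - R*), so for example a Bayes error rate of 0.1 gives an asymptotic 1NN error of at most 0.18.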
