ML LG · Dec 16, 2019

A Unified Framework for Random Forest Prediction Error Estimation

arXiv:1912.07435v5 · 32 citations

Originality: Incremental advance

AI Analysis

This work addresses uncertainty quantification for random forest predictions, a problem relevant to practitioners in statistics and machine learning. The contribution appears incremental: it builds on existing methods with specific improvements.

The authors estimate prediction error for random forests through a unified framework built on a novel estimator of the conditional prediction error distribution function. The framework enables plug-in estimation of uncertainty metrics such as conditional mean squared prediction errors and conditional quantiles. In simulations, the resulting prediction intervals are competitive with existing methods and outperform them in some settings.

We introduce a unified framework for random forest prediction error estimation based on a novel estimator of the conditional prediction error distribution function. Our framework enables simple plug-in estimation of key prediction uncertainty metrics, including conditional mean squared prediction errors, conditional biases, and conditional quantiles, for random forests and many variants. Our approach is especially well-adapted for prediction interval estimation; we show via simulations that our proposed prediction intervals are competitive with, and in some settings outperform, existing methods. To establish theoretical grounding for our framework, we prove pointwise uniform consistency of a more stringent version of our estimator of the conditional prediction error distribution function. The estimators introduced here are implemented in the R package forestError.
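
To make the "plug-in" idea concrete, here is a minimal Python sketch. It rests on assumptions not stated in the abstract: that the estimated conditional error distribution at a test point can be represented as a weighted empirical distribution over a set of observed prediction errors, and that errors are defined as observed response minus forest prediction. The function name `plug_in_metrics` and its arguments are hypothetical and are not part of the forestError package. How the weights are obtained is left open here.

```python
import numpy as np

def plug_in_metrics(errors, weights, alpha=0.05):
    """Plug-in summaries of a weighted empirical error distribution.

    errors  : prediction errors (assumed: observed minus predicted)
    weights : nonnegative weights, one per error (assumed given)
    alpha   : one minus the nominal coverage of the interval
    """
    errors = np.asarray(errors, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize to a probability vector

    # Conditional bias and mean squared prediction error as weighted means.
    bias = np.sum(weights * errors)
    mspe = np.sum(weights * errors ** 2)

    # Weighted empirical CDF over the sorted errors.
    order = np.argsort(errors)
    sorted_errors = errors[order]
    cdf = np.cumsum(weights[order])

    def quantile(p):
        # Smallest error whose cumulative weight reaches p.
        idx = np.searchsorted(cdf, p, side="left")
        return sorted_errors[min(idx, len(sorted_errors) - 1)]

    return {
        "bias": bias,
        "mspe": mspe,
        "lower": quantile(alpha / 2),
        "upper": quantile(1 - alpha / 2),
    }
```

Under these assumptions, a prediction interval for a test point would be the forest's point prediction plus the `lower` and `upper` error quantiles.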
