ML | LG | Jun 15, 2021

RFpredInterval: An R Package for Prediction Intervals with Random Forests and Boosted Forests

arXiv:2106.08217v2
AI Analysis

This work provides a tool for researchers and practitioners in statistics and machine learning to improve uncertainty quantification in forest-based models, though it is incremental as it builds on existing methods.

The authors tackled the problem of quantifying uncertainty in predictions from random forests and boosted forests by developing an R package, RFpredInterval, which integrates 16 methods for building prediction intervals, including a new method for boosted forests. The results from simulations and real data analyses show that the proposed method is very competitive and globally outperforms ten existing methods.

Like many predictive models, random forests provide point predictions for new observations. Besides the point prediction, it is important to quantify the uncertainty in the prediction. Prediction intervals provide information about the reliability of the point predictions. We have developed a comprehensive R package, RFpredInterval, that integrates 16 methods to build prediction intervals with random forests and boosted forests. The set of methods implemented in the package includes a new method to build prediction intervals with boosted forests (PIBF) and 15 method variations to produce prediction intervals with random forests, as proposed by Roy and Larocque (2020). We perform an extensive simulation study and apply real data analyses to compare the performance of the proposed method to ten existing methods for building prediction intervals with random forests. The results show that the proposed method is very competitive and, globally, outperforms competing methods.
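To make the idea of a prediction interval concrete, here is a minimal sketch of one generic construction: shift the empirical quantiles of held-out residuals by the point prediction, then check how often the resulting intervals cover new observations. This is not the PIBF method and not the RFpredInterval API. The function names `residual_quantile_interval` and `coverage`, the simulated data, and the stand-in predictor `predict` are all assumptions made for illustration. In a forest setting, the held-out residuals would typically be out-of-bag residuals.

```python
import random

def residual_quantile_interval(point_pred, residuals, alpha=0.05):
    """Return (lower, upper) by shifting the empirical alpha/2 and
    1 - alpha/2 residual quantiles by the point prediction."""
    ordered = sorted(residuals)
    n = len(ordered)
    lo_idx = max(0, int((alpha / 2) * n))
    hi_idx = min(n - 1, int((1 - alpha / 2) * n))
    return point_pred + ordered[lo_idx], point_pred + ordered[hi_idx]

def coverage(intervals, y_true):
    """Fraction of true responses that fall inside their interval."""
    hits = sum(lo <= y <= hi for (lo, hi), y in zip(intervals, y_true))
    return hits / len(y_true)

random.seed(0)

def predict(x):
    # Hypothetical stand-in for a fitted forest's point prediction.
    return 2.0 * x

# Held-out residuals (y - prediction); simulated here with unit Gaussian noise.
calib_x = [random.uniform(0, 10) for _ in range(2000)]
calib_y = [2.0 * x + random.gauss(0, 1) for x in calib_x]
residuals = [y - predict(x) for x, y in zip(calib_x, calib_y)]

# New observations: build a 95% interval around each point prediction.
test_x = [random.uniform(0, 10) for _ in range(2000)]
test_y = [2.0 * x + random.gauss(0, 1) for x in test_x]
intervals = [residual_quantile_interval(predict(x), residuals) for x in test_x]

cov = coverage(intervals, test_y)
print(f"empirical coverage: {cov:.3f}")
```

The empirical coverage should land close to the nominal 95% level here because the simulated noise has the same distribution everywhere. Methods like those compared in the paper matter most when that assumption fails.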

Code Implementations: 1 repo