ML | LG | Sep 26, 2025

Localized Uncertainty Quantification in Random Forests via Proximities

arXiv:2509.22928v1 | h-index: 2
Originality: Incremental advance
AI Analysis

This paper addresses the problem of assessing prediction reliability for users of random forests in high-stakes scenarios. The advance is incremental, since it builds on existing proximity-based techniques.

The paper tackles localized uncertainty quantification in random forests by using proximities to form localized distributions of out-of-bag errors. The result is adjustable prediction intervals for regression and trust scores for classification, with the trust scores yielding higher accuracy-rejection AUC scores than competing methods.

In machine learning, uncertainty quantification helps assess the reliability of model predictions, which is important in high-stakes scenarios. Traditional approaches often emphasize predictive accuracy, but there is a growing focus on incorporating uncertainty measures. This paper addresses localized uncertainty quantification in random forests. While current methods often rely on quantile regression or Monte Carlo techniques, we propose a new approach using naturally occurring test sets and similarity measures (proximities) typically viewed as byproducts of random forests. Specifically, we form localized distributions of OOB errors around nearby points, defined using the proximities, to create prediction intervals for regression and trust scores for classification. By varying the number of nearby points, our intervals can be adjusted to achieve the desired coverage while retaining the flexibility that reflects the certainty of individual predictions. For classification, excluding points identified as unclassifiable by our method generally enhances the accuracy of the model and provides higher accuracy-rejection AUC scores than competing methods.
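The regression idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the classical leaf co-occurrence proximity (the fraction of trees in which two points share a leaf), which may differ from the proximity definition used in the paper. The function name `proximity_intervals` and the parameters `k` (number of nearby points) and `alpha` (miscoverage level) are hypothetical names chosen for illustration, and the data are synthetic.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split


def proximity_intervals(forest, X_train, y_train, X_test, k=50, alpha=0.1):
    """Prediction intervals from OOB errors of the k most proximal training points.

    Assumption: proximity = fraction of trees in which two points land in the
    same leaf. The paper may define proximities differently.
    """
    # Out-of-bag residuals act as a naturally occurring test set.
    oob_resid = y_train - forest.oob_prediction_
    train_leaves = forest.apply(X_train)  # shape (n_train, n_trees)
    test_leaves = forest.apply(X_test)    # shape (n_test, n_trees)
    preds = forest.predict(X_test)

    lower = np.empty(len(X_test))
    upper = np.empty(len(X_test))
    for i, leaves in enumerate(test_leaves):
        # Proximity of test point i to every training point.
        prox = (train_leaves == leaves).mean(axis=1)
        # Indices of the k training points with the highest proximity.
        nearest = np.argsort(-prox)[:k]
        # Quantiles of the localized OOB error distribution.
        q_lo, q_hi = np.quantile(oob_resid[nearest], [alpha / 2, 1 - alpha / 2])
        lower[i] = preds[i] + q_lo
        upper[i] = preds[i] + q_hi
    return preds, lower, upper


# Synthetic data, used only to exercise the sketch.
X, y = make_regression(n_samples=400, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

forest = RandomForestRegressor(n_estimators=200, oob_score=True, random_state=0)
forest.fit(X_train, y_train)

preds, lower, upper = proximity_intervals(forest, X_train, y_train, X_test)
coverage = np.mean((y_test >= lower) & (y_test <= upper))
print(f"empirical coverage: {coverage:.2f}")
```

In this sketch, `k` controls how local the error distribution is: a small `k` lets interval width vary more from point to point, while a large `k` pulls every interval toward the global OOB error quantiles. This mirrors the abstract's statement that varying the number of nearby points adjusts the intervals.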

