ML | LG | Feb 25

LoBoost: Fast Model-Native Local Conformal Prediction for Gradient-Boosted Trees

arXiv:2602.22432v1 | 1 citation | h-index: 4
AI Analysis

LoBoost addresses the need for efficient, adaptive uncertainty estimation in tabular regression with gradient-boosted trees. It is a model-native solution that requires neither retraining nor auxiliary models.

The paper tackles uncertainty quantification for gradient-boosted trees by proposing LoBoost, a local conformal prediction method that reuses the ensemble's leaf structure to define calibration groups. The authors report competitive interval quality, improved test MSE on most datasets, and large calibration speedups.

Gradient-boosted decision trees are among the strongest off-the-shelf predictors for tabular regression, but point predictions alone do not quantify uncertainty. Conformal prediction provides distribution-free marginal coverage, yet split conformal uses a single global residual quantile and can be poorly adaptive under heteroscedasticity. Methods that improve adaptivity typically fit auxiliary nuisance models or introduce additional data splits/partitions to learn the conformal score, increasing cost and reducing data efficiency. We propose LoBoost, a model-native local conformal method that reuses the fitted ensemble's leaf structure to define multiscale calibration groups. Each input is encoded by its sequence of visited leaves; at resolution level k, we group points by matching prefixes of leaf indices across the first k trees and calibrate residual quantiles within each group. LoBoost requires no retraining, auxiliary models, or extra splitting beyond the standard train/calibration split. Experiments show competitive interval quality, improved test MSE on most datasets, and large calibration speedups.
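The grouping step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes an absolute-residual conformal score, a single fixed resolution level k, and a fallback to the global split-conformal quantile for groups that never appear in the calibration set; the abstract specifies none of these choices. The function names (`conformal_quantile`, `calibrate_prefix_groups`, `predict_interval`) and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest score (split conformal)."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for a finite bound at this level.
        return math.inf
    return sorted(scores)[rank - 1]


def calibrate_prefix_groups(leaf_paths, residuals, k, alpha):
    """Group calibration points by their first k leaf indices; one quantile per group."""
    groups = defaultdict(list)
    for path, r in zip(leaf_paths, residuals):
        # Assumption: the conformal score is the absolute residual.
        groups[tuple(path[:k])].append(abs(r))
    return {key: conformal_quantile(s, alpha) for key, s in groups.items()}


def predict_interval(point_pred, leaf_path, k, group_q, global_q):
    """Symmetric interval around the point prediction using the group's quantile."""
    # Assumption: unseen prefixes fall back to the global quantile.
    q = group_q.get(tuple(leaf_path[:k]), global_q)
    return point_pred - q, point_pred + q


# Toy calibration set: each tuple lists the leaf visited in trees 1, 2, 3.
cal_paths = [(0, 1, 2), (0, 1, 3), (0, 2, 2), (1, 0, 0), (1, 0, 1), (1, 1, 1)]
cal_residuals = [0.25, -0.5, 0.125, 1.0, -2.0, 3.0]

alpha = 0.5  # deliberately large so the tiny toy groups give finite quantiles
k = 1        # group by the leaf visited in the first tree only

group_q = calibrate_prefix_groups(cal_paths, cal_residuals, k, alpha)
global_q = conformal_quantile([abs(r) for r in cal_residuals], alpha)

low_noise = predict_interval(10.0, (0, 5, 5), k, group_q, global_q)
high_noise = predict_interval(10.0, (1, 5, 5), k, group_q, global_q)
unseen = predict_interval(10.0, (2, 0, 0), k, group_q, global_q)
```

In this toy example, points whose first-tree leaf is 0 receive a narrower interval than points whose first-tree leaf is 1, which is the heteroscedastic adaptivity that a single global quantile cannot provide. In practice the leaf sequences would come from the fitted ensemble itself, for instance through scikit-learn's `GradientBoostingRegressor.apply`.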

