LoBoost: Fast Model-Native Local Conformal Prediction for Gradient-Boosted Trees
LoBoost addresses the need for efficient, adaptive uncertainty estimation in tabular regression with gradient-boosted trees, offering a model-native solution that requires no retraining or auxiliary models.
The paper tackles uncertainty quantification for gradient-boosted trees by proposing LoBoost, a local conformal prediction method that reuses the ensemble's leaf structure to define calibration groups. Reported results are competitive interval quality, improved test MSE on most datasets, and large calibration speedups.
Gradient-boosted decision trees are among the strongest off-the-shelf predictors for tabular regression, but point predictions alone do not quantify uncertainty. Conformal prediction provides distribution-free marginal coverage, yet split conformal uses a single global residual quantile and can be poorly adaptive under heteroscedasticity. Methods that improve adaptivity typically fit auxiliary nuisance models or introduce additional data splits/partitions to learn the conformal score, increasing cost and reducing data efficiency. We propose LoBoost, a model-native local conformal method that reuses the fitted ensemble's leaf structure to define multiscale calibration groups. Each input is encoded by its sequence of visited leaves; at resolution level k, we group points by matching prefixes of leaf indices across the first k trees and calibrate residual quantiles within each group. LoBoost requires no retraining, auxiliary models, or extra splitting beyond the standard train/calibration split. Experiments show competitive interval quality, improved test MSE on most datasets, and large calibration speedups.
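The prefix-grouping idea can be illustrated with a minimal sketch at a single resolution level k. This is not the authors' implementation. It makes several assumptions the abstract does not state: the score is the absolute residual, intervals are symmetric, each group uses the standard finite-sample split-conformal quantile, and inputs whose prefix was never seen during calibration fall back to the global quantile. The function names (conformal_quantile, calibrate_prefix_groups, predict_interval) and the toy data are hypothetical. Leaf sequences are passed in as plain tuples; in practice they might come from something like the per-tree leaf indices returned by scikit-learn's GradientBoostingRegressor.apply.

```python
import math
from collections import defaultdict


def conformal_quantile(scores, alpha):
    """Finite-sample split-conformal quantile of nonnegative scores."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # 1-based order statistic
    if rank > n:
        return math.inf  # too few points for the requested level
    return sorted(scores)[rank - 1]


def calibrate_prefix_groups(leaves, residuals, k, alpha):
    """Map each length-k leaf prefix to the quantile of its group's |residuals|."""
    groups = defaultdict(list)
    for leaf_seq, r in zip(leaves, residuals):
        groups[tuple(leaf_seq[:k])].append(abs(r))
    return {prefix: conformal_quantile(s, alpha) for prefix, s in groups.items()}


def predict_interval(y_hat, leaf_seq, k, group_q, global_q):
    """Symmetric interval around y_hat using the matching group's quantile."""
    # Assumption: prefixes unseen during calibration use the global quantile.
    q = group_q.get(tuple(leaf_seq[:k]), global_q)
    return (y_hat - q, y_hat + q)


# Toy calibration set: leaf indices for two trees, plus signed residuals.
cal_leaves = [(1, 3), (1, 3), (1, 4), (2, 3), (2, 3), (1, 3), (2, 4), (1, 3)]
cal_resid = [0.1, -0.2, 0.5, 2.0, -3.0, 0.3, 1.0, -0.1]

k, alpha = 1, 0.5
group_q = calibrate_prefix_groups(cal_leaves, cal_resid, k, alpha)
global_q = conformal_quantile([abs(r) for r in cal_resid], alpha)

# A test point landing in the noisy leaf 2 of the first tree gets a wider
# interval than the global quantile alone would give.
interval = predict_interval(10.0, (2, 9), k, group_q, global_q)
```

In this toy example the group with first-tree leaf 2 has much larger residuals than the group with leaf 1, so its interval is wider. That is the adaptivity a single global quantile cannot provide. Increasing k refines the groups but leaves fewer calibration points in each, which is why some rule for small or empty groups is needed. The global fallback here is only one possible choice.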