Boulevard: Regularized Stochastic Gradient Boosted Trees and Their Limiting Distribution
This work addresses uncertainty estimation in gradient boosting for regression. The contribution is incremental: it builds on existing boosting methods, adding regularization and theoretical guarantees.
The paper tackles uncertainty quantification in gradient boosting by introducing Boulevard, a regularized stochastic gradient boosting framework. Boulevard converges as the number of trees increases and admits a central limit theorem for its predictions. Simulations and real-world examples support both its predictive accuracy and its limiting behavior.
This paper examines a novel gradient boosting framework for regression. We regularize gradient boosted trees by introducing subsampling and employ a modified shrinkage algorithm so that at every boosting stage the estimate is given by an average of trees. The resulting algorithm, titled Boulevard, is shown to converge as the number of trees grows. We also demonstrate a central limit theorem for this limit, allowing a characterization of uncertainty for predictions. A simulation study and real-world examples provide support for both the predictive accuracy of the model and its limiting behavior.
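To make the "average of trees" idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes the averaging update takes the form f_b = ((b-1)/b) f_{b-1} + (lam/b) t_b, where t_b is a tree fit to residuals on a random subsample. That form, the final rescaling by (1 + lam)/lam, the use of depth-1 stumps on one-dimensional inputs, and all names (`fit_stump`, `boulevard_fit`, `lam`, `subsample`) are illustrative assumptions that the abstract does not specify.

```python
import random

def fit_stump(xs, rs):
    """Fit a depth-1 regression tree (stump) to residuals rs on 1-D inputs xs."""
    pairs = sorted(zip(xs, rs))
    n = len(pairs)
    total = sum(r for _, r in pairs)
    # (score, split point, left mean, right mean); split None means "no split"
    best = (float("inf"), None, total / n, total / n)
    left_sum = 0.0
    for i in range(1, n):
        left_sum += pairs[i - 1][1]
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical x values
        left_mean = left_sum / i
        right_mean = (total - left_sum) / (n - i)
        # Minimizing squared error is equivalent to minimizing this score
        score = -(i * left_mean ** 2 + (n - i) * right_mean ** 2)
        if score < best[0]:
            mid = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (score, mid, left_mean, right_mean)
    _, split, left_mean, right_mean = best
    if split is None:
        return lambda x: left_mean
    return lambda x: left_mean if x <= split else right_mean

def boulevard_fit(xs, ys, n_trees=200, lam=0.8, subsample=0.5, seed=0):
    """Sketch of an averaged, subsampled boosting loop (assumed update form)."""
    rng = random.Random(seed)
    n = len(xs)
    m = max(2, int(subsample * n))
    trees = []
    fitted = [0.0] * n  # current ensemble estimate at the training points
    for b in range(1, n_trees + 1):
        idx = rng.sample(range(n), m)  # subsample without replacement
        tree = fit_stump([xs[i] for i in idx],
                         [ys[i] - fitted[i] for i in idx])
        trees.append(tree)
        # Assumed averaging update: f_b = (b-1)/b * f_{b-1} + lam/b * t_b
        fitted = [(b - 1) / b * f + lam / b * tree(x)
                  for f, x in zip(fitted, xs)]

    def predict(x, rescale=True):
        avg = lam * sum(t(x) for t in trees) / len(trees)
        # The averaged estimate is shrunk toward zero; the (1 + lam) / lam
        # factor (an assumption here) undoes that shrinkage.
        return avg * (1 + lam) / lam if rescale else avg

    return predict

# Toy data: a noisy step function on [0, 1)
rng = random.Random(1)
xs = [i / 100 for i in range(100)]
ys = [(1.0 if x >= 0.5 else -1.0) + rng.gauss(0, 0.1) for x in xs]
predict = boulevard_fit(xs, ys)
print(predict(0.1), predict(0.9))
```

Because each new tree enters with weight lam/b instead of a fixed learning rate, the ensemble at stage b equals lam times the plain average of the b trees fit so far. Under the assumed update, that averaging is why the estimate settles down as trees are added, where a fixed-rate ensemble would keep fitting the training data more closely.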