A Fast Sampling Gradient Tree Boosting Framework
This work addresses the computational bottleneck for users of gradient boosting, offering a faster method with theoretical and empirical support, though it is incremental as it builds on existing stochastic gradient boosting.
The paper tackles the computational expensiveness of gradient tree boosting by combining it with importance sampling and a regularizer to reduce stochastic variance, achieving a 2.5x to 18x acceleration on LogitBoost and LambdaMART without significant performance loss.
As an adaptive, interpretable, robust, and accurate meta-algorithm for arbitrary differentiable loss functions, gradient tree boosting is one of the most popular machine learning techniques, though the computational expensiveness severely limits its usage. Stochastic gradient boosting could be adopted to accelerates gradient boosting by uniformly sampling training instances, but its estimator could introduce a high variance. This situation arises motivation for us to optimize gradient tree boosting. We combine gradient tree boosting with importance sampling, which achieves better performance by reducing the stochastic variance. Furthermore, we use a regularizer to improve the diagonal approximation in the Newton step of gradient boosting. The theoretical analysis supports that our strategies achieve a linear convergence rate on logistic loss. Empirical results show that our algorithm achieves a 2.5x--18x acceleration on two different gradient boosting algorithms (LogitBoost and LambdaMART) without appreciable performance loss.