MECOMLAug 13, 2020

An information criterion for automatic gradient tree boosting

arXiv:2008.05926v16 citations
Originality Incremental advance
AI Analysis

This work addresses the computational inefficiency of hyperparameter tuning in gradient boosting for practitioners, though it is incremental as it builds on existing boosting methods.

The paper tackled the problem of efficiently selecting model complexity and number of trees in gradient tree boosting by proposing an information-theoretic criterion based on the optimism of leaf splitting, eliminating the need for cross-validation. This approach achieved speedups of 10 to 1400 times compared to xgboost while maintaining similar predictive performance in terms of test loss.

An information theoretic approach to learning the complexity of classification and regression trees and the number of trees in gradient tree boosting is proposed. The optimism (test loss minus training loss) of the greedy leaf splitting procedure is shown to be the maximum of a Cox-Ingersoll-Ross process, from which a generalization-error based information criterion is formed. The proposed procedure allows fast local model selection without cross validation based hyper parameter tuning, and hence efficient and automatic comparison among the large number of models performed during each boosting iteration. Relative to xgboost, speedups on numerical experiments ranges from around 10 to about 1400, at similar predictive-power measured in terms of test-loss.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes