Accelerated Componentwise Gradient Boosting using Efficient Data Representation and Momentum-based Optimization
This work addresses efficiency bottlenecks that affect researchers who rely on interpretable boosting models. Its contribution is incremental, since it builds on existing methods.
The paper reduces the runtime and memory cost of componentwise gradient boosting by discretizing numerical features and adding Nesterov momentum, while maintaining state-of-the-art performance on simulated and real-world datasets.
Componentwise boosting (CWB), also known as model-based boosting, is a variant of gradient boosting that builds on additive models as base learners to ensure interpretability. CWB is thus often used in research areas where models are employed as tools to explain relationships in data. One downside of CWB is its computational complexity in terms of memory and runtime. In this paper, we propose two techniques to overcome these issues without losing the properties of CWB: discretization of numerical features and incorporation of Nesterov momentum into functional gradient descent. As the latter can be prone to early overfitting, we also propose a hybrid approach that prevents a possibly diverging gradient descent routine while ensuring faster convergence. We perform extensive benchmarks on multiple simulated and real-world datasets to demonstrate improvements in runtime and memory consumption, with state-of-the-art estimation and prediction performance maintained.
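The discretization idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it makes several simplifying assumptions: squared-error loss, one linear base learner per feature instead of the spline base learners typically used in CWB, an equally spaced grid, a default of roughly sqrt(n) bins, and non-constant features. The function names `discretize` and `cwb_binned` are hypothetical. Momentum and the hybrid approach are not covered. The point is that, after binning, each feature is stored as a short vector of grid values plus an integer index per observation, and the per-iteration fit works on per-bin sums rather than on the full feature column.

```python
import numpy as np


def discretize(x, n_bins):
    """Map each value of x to the nearest point of an equally spaced grid.

    Returns the grid values and, per observation, the index of its grid point.
    Assumes x is not constant and n_bins >= 2 (illustrative simplification).
    """
    grid = np.linspace(x.min(), x.max(), n_bins)
    step = grid[1] - grid[0]
    idx = np.rint((x - grid[0]) / step).astype(np.intp)
    return grid, np.clip(idx, 0, n_bins - 1)


def cwb_binned(X, y, n_iter=300, lr=0.1, n_bins=None):
    """Componentwise L2 boosting with linear base learners on binned features."""
    n, p = X.shape
    if n_bins is None:
        n_bins = max(2, int(np.sqrt(n)))  # assumed default, roughly sqrt(n) bins

    learners = []
    for j in range(p):
        grid, idx = discretize(X[:, j], n_bins)
        counts = np.bincount(idx, minlength=n_bins)
        z = grid - (grid @ counts) / n    # grid values centered over all observations
        xtx = (z ** 2) @ counts           # x'x computed from n_bins values only
        learners.append((z, idx, xtx))

    intercept = y.mean()
    f = np.full(n, intercept)             # current model prediction
    coef = np.zeros(p)                    # slopes with respect to centered binned features

    for _ in range(n_iter):
        r = y - f                         # pseudo-residuals for squared-error loss
        best_j, best_gain, best_beta = 0, -np.inf, 0.0
        for j, (z, idx, xtx) in enumerate(learners):
            # x'r: sum residuals within each bin, then one short dot product.
            xtr = z @ np.bincount(idx, weights=r, minlength=n_bins)
            gain = xtr ** 2 / xtx         # reduction in squared error for this learner
            if gain > best_gain:
                best_j, best_gain, best_beta = j, gain, xtr / xtx
        z, idx, _ = learners[best_j]
        coef[best_j] += lr * best_beta    # update only the selected component
        f += lr * best_beta * z[idx]

    return intercept, coef


# Small synthetic usage example (the data is made up for illustration).
rng = np.random.default_rng(0)
X_demo = rng.uniform(-1, 1, size=(2000, 3))
y_demo = 2.0 * X_demo[:, 0] - 1.0 * X_demo[:, 1] + rng.normal(scale=0.1, size=2000)
intercept_demo, coef_demo = cwb_binned(X_demo, y_demo)
```

In this sketch only one base learner is updated per iteration, which is what keeps the fitted model additive and interpretable per feature. The binning changes how the inner products are computed and stored, not which learner is selected.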