Out-of-Core GPU Gradient Boosting
The work addresses a bottleneck for machine learning practitioners who need to train on large datasets with limited GPU memory, though the contribution is incremental: it adapts an existing method to new hardware constraints.
The paper tackles GPU memory limits when training on large datasets by introducing an out-of-core GPU gradient boosting algorithm in XGBoost, so that much larger datasets can fit on a given GPU without degrading model accuracy or training time.
GPU-based algorithms have greatly accelerated many machine learning methods; however, GPU memory is typically smaller than main memory, limiting the size of training data. In this paper, we describe an out-of-core GPU gradient boosting algorithm implemented in the XGBoost library. We show that much larger datasets can fit on a given GPU, without degrading model accuracy or training time. To the best of our knowledge, this is the first out-of-core GPU implementation of gradient boosting. Similar approaches can be applied to other machine learning algorithms.
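To make the out-of-core idea concrete, the following is a minimal CPU-only sketch in NumPy, not the XGBoost implementation. It assumes features are already quantized into integer bins and shows that per-bin gradient sums, the statistic that histogram-based tree builders typically rely on, can be accumulated one batch at a time, so only a single batch needs to be resident in limited memory. The function names, the batch size, and the synthetic data are all hypothetical and chosen for illustration only.

```python
import numpy as np


def gradient_histogram(bins, grad, n_bins):
    """Sum gradients per (feature, bin) for one batch of pre-binned rows."""
    n_features = bins.shape[1]
    hist = np.zeros((n_features, n_bins))
    for f in range(n_features):
        # bincount adds up the gradient of every row that falls in each bin
        hist[f] = np.bincount(bins[:, f], weights=grad, minlength=n_bins)
    return hist


def iter_batches(bins, grad, batch_size):
    """Yield consecutive row batches; stands in for reading pages from disk."""
    for start in range(0, len(grad), batch_size):
        stop = start + batch_size
        yield bins[start:stop], grad[start:stop]


def out_of_core_histogram(batches, n_features, n_bins):
    """Accumulate the histogram batch by batch (one batch in memory at a time)."""
    total = np.zeros((n_features, n_bins))
    for batch_bins, batch_grad in batches:
        total += gradient_histogram(batch_bins, batch_grad, n_bins)
    return total


# Synthetic, hypothetical data: 1000 rows, 4 features, 16 bins per feature.
rng = np.random.default_rng(0)
bins = rng.integers(0, 16, size=(1000, 4))
grad = rng.normal(size=1000)

full = gradient_histogram(bins, grad, 16)  # everything in memory at once
streamed = out_of_core_histogram(iter_batches(bins, grad, 128), 4, 16)
```

Because the histogram is a plain sum over rows, the streamed result matches the in-memory one up to floating-point rounding. This is consistent with the claim that accuracy need not degrade, although the sketch says nothing about training time or about how the paper manages host-to-device transfers.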