LG | ML | Jun 29, 2018

XGBoost: Scalable GPU Accelerated Learning

arXiv:1806.11248v1 | 50 citations | Has Code
Originality: Incremental advance
AI Analysis

The work enables faster training for users of gradient boosting on large datasets, though the advance is incremental because it builds on existing XGBoost features.

The paper tackles the problem of scaling gradient boosting to large datasets by developing a multi-GPU accelerated algorithm in XGBoost. The algorithm processes 115 million training instances in under three minutes on a publicly available cloud computing instance.

We describe the multi-GPU gradient boosting algorithm implemented in the XGBoost library (https://github.com/dmlc/xgboost). Our algorithm allows fast, scalable training on multi-GPU systems with all of the features of the XGBoost library. We employ data compression techniques to minimise the usage of scarce GPU memory while still allowing highly efficient implementation. Using our algorithm we show that it is possible to process 115 million training instances in under three minutes on a publicly available cloud computing instance. The algorithm is implemented using end-to-end GPU parallelism, with prediction, gradient calculation, feature quantisation, decision tree construction and evaluation phases all computed on device.
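
The abstract names feature quantisation and data compression as the means of saving GPU memory. The sketch below illustrates the general idea only and is not taken from the XGBoost source: a feature column is mapped to a small number of bin indices, and those indices are then packed with just enough bits per value. It runs in plain Python on the CPU, and every function name here (quantile_cuts, quantise, bits_needed, pack, unpack) is hypothetical and chosen for illustration.

```python
from bisect import bisect_right


def quantile_cuts(values, n_bins):
    """Return up to n_bins - 1 cut points taken at evenly spaced ranks."""
    ordered = sorted(values)
    cuts = []
    for k in range(1, n_bins):
        cut = ordered[(k * len(ordered)) // n_bins]
        if not cuts or cut > cuts[-1]:  # skip duplicate cut points
            cuts.append(cut)
    return cuts


def quantise(values, cuts):
    """Map each raw value to the index of the bin it falls into."""
    return [bisect_right(cuts, v) for v in values]


def bits_needed(n_symbols):
    """Smallest bit width able to represent n_symbols distinct indices."""
    return max(1, (n_symbols - 1).bit_length())


def pack(indices, width):
    """Pack bin indices into bytes, using `width` bits per index."""
    buffer = 0
    for pos, idx in enumerate(indices):
        buffer |= idx << (pos * width)
    n_bytes = (len(indices) * width + 7) // 8
    return buffer.to_bytes(n_bytes, "little")


def unpack(packed, width, count):
    """Recover `count` bin indices from a packed byte string."""
    buffer = int.from_bytes(packed, "little")
    mask = (1 << width) - 1
    return [(buffer >> (pos * width)) & mask for pos in range(count)]


# Toy feature column (made-up values, for illustration only).
feature = [0.1, 2.5, 0.7, 9.0, 4.2, 3.3, 0.2, 7.8]

cuts = quantile_cuts(feature, n_bins=4)
bins = quantise(feature, cuts)
width = bits_needed(len(cuts) + 1)
packed = pack(bins, width)
restored = unpack(packed, width, len(bins))

print("bin indices:", bins)
print("bits per value:", width)
print("packed size in bytes:", len(packed))
```

With four bins each value needs two bits, so the eight values above fit in two bytes, compared with 32 bytes if each were stored as a 32-bit float. Tree construction can then work on the bin indices alone, which is the usual reason for quantising before training.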

Code Implementations: 1 repo