LGMar 9, 2016

XGBoost: A Scalable Tree Boosting System

arXiv:1603.02754v354879 citations
Originality Highly original
AI Analysis

It provides a highly effective and widely used solution for data scientists to handle large-scale machine learning tasks, with broad impact across various domains.

The paper tackled the problem of scaling tree boosting to large datasets by introducing XGBoost, a scalable system that achieved state-of-the-art results on many challenges using far fewer resources, scaling beyond billions of examples.

Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.

Code Implementations28 repos

Data from Papers with Code (CC-BY-SA-4.0)

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes