LGSep 24, 2021

Learning Multi-Layered GBDT Via Back Propagation

arXiv:2109.11863v2
Originality Incremental advance
AI Analysis

This work addresses a bottleneck in GBDT for tabular data by enabling multi-layered learning, potentially benefiting machine learning practitioners, but it is incremental as it builds on existing GBDT and neural network concepts.

The paper tackles the problem of enabling multi-layered representation learning in gradient boosting decision trees (GBDT), which are non-differentiable, by proposing a framework that approximates gradients via linear regression to allow back propagation. Experiments show the method improves performance and representation ability, offering a new approach for deep tree-based learning.

Deep neural networks are able to learn multi-layered representation via back propagation (BP). Although the gradient boosting decision tree (GBDT) is effective for modeling tabular data, it is non-differentiable with respect to its input, thus suffering from learning multi-layered representation. In this paper, we propose a framework of learning multi-layered GBDT via BP. We approximate the gradient of GBDT based on linear regression. Specifically, we use linear regression to replace the constant value at each leaf ignoring the contribution of individual samples to the tree structure. In this way, we estimate the gradient for intermediate representations, which facilitates BP for multi-layered GBDT. Experiments show the effectiveness of the proposed method in terms of performance and representation ability. To the best of our knowledge, this is the first work of optimizing multi-layered GBDT via BP. This work provides a new possibility of exploring deep tree based learning and combining GBDT with neural networks.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes