MLLGMEFeb 9, 2020

Stochastic tree ensembles for regularized nonlinear regression

arXiv:2002.03375v433 citations
AI Analysis

This provides a more efficient and accurate tool for nonlinear regression tasks, with potential applications in data analysis and machine learning, though it appears incremental as it builds on existing tree-based methods.

The paper tackles the problem of nonlinear regression by developing XBART, a stochastic tree ensemble method that combines regularization and stochastic search for improved performance, achieving faster and more accurate results than XGBoost in many settings.

This paper develops a novel stochastic tree ensemble method for nonlinear regression, which we refer to as XBART, short for Accelerated Bayesian Additive Regression Trees. By combining regularization and stochastic search strategies from Bayesian modeling with computationally efficient techniques from recursive partitioning approaches, the new method attains state-of-the-art performance: in many settings it is both faster and more accurate than the widely-used XGBoost algorithm. Via careful simulation studies, we demonstrate that our new approach provides accurate point-wise estimates of the mean function and does so faster than popular alternatives, such as BART, XGBoost and neural networks (using Keras). We also prove a number of basic theoretical results about the new algorithm, including consistency of the single tree version of the model and stationarity of the Markov chain produced by the ensemble version. Furthermore, we demonstrate that initializing standard Bayesian additive regression trees Markov chain Monte Carlo (MCMC) at XBART-fitted trees considerably improves credible interval coverage and reduces total run-time.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes