Sharded Bayesian Additive Regression Trees
This work addresses the scalability of BART on large datasets, an incremental improvement relevant to practitioners in machine learning and statistics.
The paper scales Bayesian Additive Regression Trees (BART) with a sharded model: a randomization auxiliary variable and a sharding tree partition the data, and each partition is fit with its own sub-model. An intersection tree structure then specifies both the sharding and the modeling using tree structures alone.
In this paper we develop the randomized Sharded Bayesian Additive Regression Trees (SBT) model. We introduce a randomization auxiliary variable and a sharding tree to decide how the data are partitioned, and fit each partition component with a Bayesian Additive Regression Trees (BART) sub-model. By observing that the optimal design of a sharding tree can determine optimal sharding for sub-models on a product space, we introduce an intersection tree structure to completely specify both the sharding and the modeling using only tree structures. In addition to experiments, we also derive the theoretical optimal weights for minimizing posterior contractions and prove the worst-case complexity of SBT.
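The sharding idea can be illustrated with a minimal sketch. This is not the authors' implementation, and every name in it (`assign_shards`, `fit_line`, `fit_sharded`, `predict`) is hypothetical. It makes three simplifying assumptions: the sharding tree is reduced to sorted cutpoints on a uniform auxiliary variable u, each sub-model is an ordinary least-squares line standing in for BART, and shard predictions are combined with weights proportional to shard size rather than the theoretically optimal weights derived in the paper.

```python
import bisect
import random


def assign_shards(n, cutpoints, rng):
    """Draw u_i ~ Uniform(0, 1) per observation and route it to a shard.

    The sorted cutpoints on u stand in for the leaves of a sharding tree
    (an assumption made for illustration only).
    """
    u = [rng.random() for _ in range(n)]
    return [bisect.bisect_right(cutpoints, ui) for ui in u]


def fit_line(xs, ys):
    """Least-squares fit of y = a + b * x; a placeholder for a BART sub-model."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0
    return my - b * mx, b


def fit_sharded(xs, ys, cutpoints, seed=0):
    """Partition the data by the auxiliary variable and fit one sub-model per shard."""
    rng = random.Random(seed)
    shard_ids = assign_shards(len(xs), cutpoints, rng)
    models = []
    for k in range(len(cutpoints) + 1):
        idx = [i for i, s in enumerate(shard_ids) if s == k]
        if not idx:
            continue  # skip empty shards
        a, b = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        models.append((len(idx), a, b))
    return models


def predict(models, x):
    """Combine shard predictions, weighted by shard size (an assumed weighting)."""
    total = sum(n for n, _, _ in models)
    return sum(n * (a + b * x) for n, a, b in models) / total


# Usage: noiseless data from y = 2x + 1, split into four shards.
xs = [float(i) for i in range(200)]
ys = [2.0 * x + 1.0 for x in xs]
models = fit_sharded(xs, ys, cutpoints=[0.25, 0.5, 0.75])
prediction = predict(models, 10.0)
```

Because u is drawn independently of the inputs, each shard is a random subsample of the full dataset, so every sub-model sees the same input region and the shards can be fit in parallel.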