Treeging
Treeging addresses the need for robust prediction models in spatial and space-time data analysis. It offers a hybrid approach that mitigates weaknesses of existing methods, though the contribution appears incremental because it combines known techniques.
The paper tackles the problem of spatial and space-time prediction by combining regression trees and kriging into an ensemble algorithm called Treeging, which performs well across varied simulations and outperforms competitors like ordinary kriging and random forest in predicting atmospheric pollutants.
Treeging combines the flexible mean structure of regression trees with the covariance-based prediction strategy of kriging into the base learner of an ensemble prediction algorithm. In so doing, it combines the strengths of the two primary types of spatial and space-time prediction models: (1) models with flexible mean structures (often machine learning algorithms) that assume independently distributed data, and (2) kriging or Gaussian Process (GP) prediction models with rich covariance structures but simple mean structures. We investigate the predictive accuracy of treeging across a thorough and widely varied battery of spatial and space-time simulation scenarios, comparing it to ordinary kriging, random forest, and ensembles of ordinary kriging base learners. Treeging performs well across the board, whereas kriging suffers when dependence is weak or in the presence of spurious covariates, and random forest suffers when the covariates are less informative. Treeging also outperforms these competitors in predicting atmospheric pollutants (ozone and PM$_{2.5}$) in several case studies. We examine sensitivity to tuning parameters (number of base learners and training data sampling proportion), finding that they follow the familiar intuition of their random forest counterparts. We include a discussion of scalability, noting that any covariance approximation technique that expedites kriging (GP) may be similarly applied to expedite treeging.
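To make the idea of a tree-plus-kriging base learner inside an ensemble concrete, the following is a minimal sketch of one plausible reading of the description above, not the authors' implementation. It rests on several assumptions that do not come from the text: the tree is reduced to a depth-1 regression stump, the tree and the kriging step are fitted sequentially (kriging is applied to the tree residuals), the covariance is exponential with fixed, hypothetical parameters (`sill`, `range_`, `nugget`) rather than estimated ones, and each base learner is trained on a subsample drawn without replacement. All function names are illustrative.

```python
import numpy as np

def fit_stump(X, y):
    """Depth-1 regression tree: pick the single split minimizing squared error."""
    best = None
    for j in range(X.shape[1]):
        values = np.unique(X[:, j])
        for t in (values[:-1] + values[1:]) / 2.0:  # midpoints between distinct values
            left = X[:, j] <= t
            sse = ((y[left] - y[left].mean()) ** 2).sum() \
                + ((y[~left] - y[~left].mean()) ** 2).sum()
            if best is None or sse < best[0]:
                best = (sse, j, t, y[left].mean(), y[~left].mean())
    if best is None:  # every covariate is constant: fall back to the global mean
        return (0, np.inf, y.mean(), y.mean())
    return best[1:]

def stump_predict(stump, X):
    j, t, left_mean, right_mean = stump
    return np.where(X[:, j] <= t, left_mean, right_mean)

def exp_cov(A, B, sill=1.0, range_=0.3):
    """Exponential covariance between location sets A and B (assumed, fixed parameters)."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return sill * np.exp(-d / range_)

def fit_base_learner(S, X, y, nugget=0.1):
    """Tree mean plus simple kriging of the tree residuals."""
    stump = fit_stump(X, y)
    resid = y - stump_predict(stump, X)
    K = exp_cov(S, S) + nugget * np.eye(len(y))
    alpha = np.linalg.solve(K, resid)  # kriging weights applied to the residuals
    return stump, S, alpha

def predict_base_learner(model, S_new, X_new):
    stump, S, alpha = model
    return stump_predict(stump, X_new) + exp_cov(S_new, S) @ alpha

def fit_ensemble(S, X, y, n_learners=25, sample_prop=0.6, seed=0):
    """Fit each base learner on a random subsample of the training data."""
    rng = np.random.default_rng(seed)
    n = len(y)
    m = max(2, int(sample_prop * n))
    models = []
    for _ in range(n_learners):
        idx = rng.choice(n, size=m, replace=False)
        models.append(fit_base_learner(S[idx], X[idx], y[idx]))
    return models

def predict_ensemble(models, S_new, X_new):
    """Average the base-learner predictions."""
    return np.mean([predict_base_learner(m, S_new, X_new) for m in models], axis=0)

# Small synthetic demonstration: 2-D locations S, one covariate X.
rng = np.random.default_rng(1)
S = rng.uniform(size=(80, 2))
X = rng.normal(size=(80, 1))
y = np.where(X[:, 0] > 0, 2.0, -1.0) + np.sin(4 * S[:, 0]) + 0.1 * rng.normal(size=80)
models = fit_ensemble(S[:60], X[:60], y[:60])
pred = predict_ensemble(models, S[60:], X[60:])
print(np.sqrt(np.mean((pred - y[60:]) ** 2)))  # hold-out RMSE
```

Here `n_learners` and `sample_prop` stand in for the two tuning parameters mentioned above (number of base learners and training data sampling proportion). The dense solve in `fit_base_learner` is the step that a covariance approximation would replace in a scalable version.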