Multivariate Probabilistic Regression with Natural Gradient Boosting
This work addresses the need for joint uncertainty measures in multivariate regression problems. The contribution is incremental: it extends existing probabilistic regression methods so that multiple targets are modeled jointly rather than one at a time.
The paper tackles joint probabilistic regression for multivariate targets, such as 2D velocity vectors. It proposes a Natural Gradient Boosting (NGBoost) approach that nonparametrically models the conditional parameters of the multivariate predictive distribution, and it reports competitive performance in simulations and in a case study with oceanographic data.
Many single-target regression problems require estimates of uncertainty along with the point predictions. Probabilistic regression algorithms are well-suited for these tasks. However, the options are much more limited when the prediction target is multivariate and a joint measure of uncertainty is required. For example, in predicting a 2D velocity vector, a joint uncertainty would quantify the probability of any vector in the plane, which would be more expressive than two separate uncertainties on the x- and y-components. To enable joint probabilistic regression, we propose a Natural Gradient Boosting (NGBoost) approach based on nonparametrically modeling the conditional parameters of the multivariate predictive distribution. Our method is robust, works out-of-the-box without extensive tuning, is modular with respect to the assumed target distribution, and performs competitively in comparison to existing approaches. We demonstrate these claims in simulation and with a case study predicting two-dimensional oceanographic velocity data. An implementation of our method is available at https://github.com/stanfordmlgroup/ngboost.
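To illustrate why a joint uncertainty on a 2D velocity vector is more expressive than two separate component uncertainties, the following minimal sketch compares a correlated bivariate Gaussian with its independent-marginals counterpart. It is not the paper's method or the NGBoost API: the Gaussian form, the function name `bivariate_normal_logpdf`, and the values chosen for the mean, standard deviations, and correlation `rho` are all assumptions made for illustration.

```python
import math


def bivariate_normal_logpdf(x, y, mean, std, rho):
    """Log-density of a 2D Gaussian given per-axis std devs and correlation rho.

    Hypothetical helper for illustration; rho = 0 recovers two independent
    univariate Gaussians, i.e. separate uncertainties on each component.
    """
    zx = (x - mean[0]) / std[0]
    zy = (y - mean[1]) / std[1]
    one_minus_r2 = 1.0 - rho * rho
    # Quadratic form of the standardized residuals under correlation rho.
    quad = (zx * zx - 2.0 * rho * zx * zy + zy * zy) / one_minus_r2
    # Log of the normalizing constant 2*pi*sx*sy*sqrt(1 - rho^2).
    log_norm = math.log(2.0 * math.pi * std[0] * std[1]) + 0.5 * math.log(one_minus_r2)
    return -0.5 * quad - log_norm


# Assumed predictive parameters for one velocity vector (illustrative values).
mean = (0.0, 0.0)
std = (1.0, 1.0)
rho = 0.8  # assumed positive correlation between the x- and y-components

# Two candidate vectors with identical marginal deviations on each axis.
joint_aligned = bivariate_normal_logpdf(1.0, 1.0, mean, std, rho)
joint_opposed = bivariate_normal_logpdf(1.0, -1.0, mean, std, rho)
indep_aligned = bivariate_normal_logpdf(1.0, 1.0, mean, std, 0.0)
indep_opposed = bivariate_normal_logpdf(1.0, -1.0, mean, std, 0.0)

print(f"joint:       aligned={joint_aligned:.3f}, opposed={joint_opposed:.3f}")
print(f"independent: aligned={indep_aligned:.3f}, opposed={indep_opposed:.3f}")
```

Under these assumed parameters, the joint model assigns a higher density to the vector whose components deviate in the same direction than to the one whose components deviate in opposite directions, whereas the independent model cannot tell the two apart. This is the kind of structure that separate per-component uncertainties discard.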