ME | LG | ML | May 15, 2024

Wasserstein Gradient Boosting: A Framework for Distribution-Valued Supervised Learning

arXiv:2405.09536v2 | 5 citations | h-index: 1 | NIPS
AI Analysis

This work addresses uncertainty quantification in machine learning for applications requiring distributional estimates, though it is an incremental extension of gradient boosting.

The paper tackles distribution-valued supervised learning by extending gradient boosting, via Wasserstein gradients, to outputs that are probability distributions. The authors report that, in their experiments, the resulting method outperforms existing uncertainty quantification techniques at probabilistic prediction.
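For orientation, the Wasserstein gradient mentioned above has a standard definition in optimal transport. The notation below (a loss functional F, a distribution μ, a target π, and a parameter θ) is assumed here for illustration and is not taken from the paper, and the KL divergence is only one example of a loss functional, not necessarily the one the authors use.

```latex
% Wasserstein gradient of a functional F at a distribution \mu:
% the Euclidean gradient of the first variation of F.
\nabla_{W} \mathcal{F}(\mu)(\theta)
  = \nabla_{\theta} \frac{\delta \mathcal{F}}{\delta \mu}(\mu)(\theta)

% Example: for the KL divergence to a target \pi,
% \mathcal{F}(\mu) = \mathrm{KL}(\mu \,\|\, \pi) = \int \log\frac{\mu(\theta)}{\pi(\theta)} \, \mu(\theta) \, d\theta,
% the first variation is \log\mu(\theta) - \log\pi(\theta) + 1, so
\nabla_{W} \mathrm{KL}(\mu \,\|\, \pi)(\theta)
  = \nabla_{\theta} \log \mu(\theta) - \nabla_{\theta} \log \pi(\theta)
```

Moving probability mass along the negative of this vector field decreases the functional, which is the role the ordinary negative gradient plays for point-valued outputs.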

Gradient boosting is a sequential ensemble method that fits a new weak learner to pseudo-residuals at each iteration. We propose Wasserstein gradient boosting, a novel extension of gradient boosting that fits a new weak learner to alternative pseudo-residuals that are Wasserstein gradients of loss functionals of probability distributions assigned at each input. It solves distribution-valued supervised learning, where the output values of the training dataset are probability distributions for each input. In classification and regression, a model typically returns, for each input, a point estimate of a parameter of a noise distribution specified for a response variable, such as the class probability parameter of a categorical distribution specified for a response label. A main application of Wasserstein gradient boosting in this paper is tree-based evidential learning, which returns a distributional estimate of the response parameter for each input. We empirically demonstrate the superior performance of the probabilistic prediction by Wasserstein gradient boosting in comparison with existing uncertainty quantification methods.
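The first sentence of the abstract describes classic gradient boosting, which can be sketched in a few lines. This is a minimal illustration of that baseline only, not the paper's Wasserstein variant: it assumes squared-error loss (so the pseudo-residuals are simply the current residuals), one-dimensional inputs, and hand-written decision stumps as weak learners. The names `fit_stump` and `gradient_boost` are hypothetical helpers introduced here.

```python
from statistics import fmean


def fit_stump(xs, targets):
    """Fit a one-split regression stump to (xs, targets) by least squares."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # every split that leaves both sides non-empty
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_mean, right_mean = fmean(left), fmean(right)
        sse = sum((t - left_mean) ** 2 for t in left)
        sse += sum((t - right_mean) ** 2 for t in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # all inputs identical: fall back to a constant learner
        constant = fmean(targets)
        return lambda x: constant
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Classic gradient boosting for squared-error loss (assumed loss)."""
    base = fmean(ys)  # initial constant prediction
    learners = []

    def predict(x):
        return base + learning_rate * sum(h(x) for h in learners)

    for _ in range(n_rounds):
        # Pseudo-residuals: negative gradient of 0.5 * (y - f(x))**2 w.r.t. f(x).
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        # Fit a new weak learner to the pseudo-residuals and add it to the ensemble.
        learners.append(fit_stump(xs, residuals))
    return predict


def mean_squared_error(model, xs, ys):
    return fmean((y - model(x)) ** 2 for x, y in zip(xs, ys))


# Toy data (made up for illustration): a quadratic trend.
xs = [float(i) for i in range(10)]
ys = [x * x for x in xs]
model = gradient_boost(xs, ys)
baseline_mean = fmean(ys)
baseline = lambda x: baseline_mean
```

In the Wasserstein variant described in the abstract, the per-input prediction is a probability distribution rather than a number, and the residuals computed inside the loop are replaced by Wasserstein gradients of a loss functional. The sketch above does not implement that step.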

Code Implementations: 1 repo
