Bayesian Approaches to Distribution Regression
This work addresses the handling of uncertainty in group-level data, a problem relevant to machine learning researchers. The contribution is incremental, as it builds on existing distribution regression methods.
The paper tackles distribution regression, where labels are available only at the group level. It uses a Bayesian approach to account for the uncertainty caused by varying group sizes, which improves robustness and performance, as demonstrated on toy datasets and on an age prediction task from images.
Distribution regression has recently attracted much interest as a generic solution to the problem of supervised learning where labels are available at the group level, rather than at the individual level. Current approaches, however, do not propagate the uncertainty in observations due to sampling variability in the groups. This effectively assumes that small and large groups are estimated equally well, and should have equal weight in the final regression. We account for this uncertainty with a Bayesian distribution regression formalism, improving the robustness and performance of the model when group sizes vary. We frame our models in a neural network style, allowing for simple MAP inference using backpropagation to learn the parameters, as well as MCMC-based inference which can fully propagate uncertainty. We demonstrate our approach on illustrative toy datasets, as well as on a challenging problem of predicting age from images.
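To make the role of group size concrete, the following is a minimal sketch and not the authors' model. It assumes each group is summarized by the mean of random Fourier features of its samples, and that each feature's group mean has a zero-mean normal prior with normally distributed observations around it. The names `random_fourier_features`, `posterior_mean_embedding`, `prior_var`, and `noise_var` are hypothetical and chosen only for illustration. Under these assumptions, the posterior variance of a group's embedding shrinks as the group grows, so a small group carries more uncertainty into the regression than a large one.

```python
import numpy as np


def random_fourier_features(x, freqs, phases):
    """Map samples x of shape (n, d) to random Fourier features of shape (n, D)."""
    num_features = freqs.shape[1]
    return np.sqrt(2.0 / num_features) * np.cos(x @ freqs + phases)


def posterior_mean_embedding(feats, prior_var=1.0, noise_var=1.0):
    """Conjugate normal posterior over a group's mean embedding.

    Illustrative assumption (not taken from the paper): for each feature,
    mu ~ N(0, prior_var) and each observed feature value ~ N(mu, noise_var).
    Returns the posterior mean (shape (D,)) and the shared posterior variance.
    """
    n = feats.shape[0]
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mean = post_var * feats.sum(axis=0) / noise_var
    return post_mean, post_var


rng = np.random.default_rng(0)
d, D = 2, 50  # input dimension and number of random features
freqs = rng.normal(size=(d, D))
phases = rng.uniform(0.0, 2.0 * np.pi, size=D)

# Two groups drawn from the same distribution but with very different sizes.
small_group = rng.normal(size=(5, d))
large_group = rng.normal(size=(500, d))

mean_small, var_small = posterior_mean_embedding(
    random_fourier_features(small_group, freqs, phases)
)
mean_large, var_large = posterior_mean_embedding(
    random_fourier_features(large_group, freqs, phases)
)

print("posterior variance, small group:", var_small)
print("posterior variance, large group:", var_large)
```

A point estimate that used only the empirical mean of each group would treat both groups identically. Here the per-group variance is available to a downstream regressor, for example as a weight or as the spread of samples drawn from the posterior.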