A Distance-based Framework for Gaussian Processes over Probability Distributions
For practitioners who need to model data whose inputs are distributions (e.g., noisy measurements or distributional data), this paper provides a principled extension of Gaussian processes (GPs).
This paper extends Gaussian process regression to handle inputs that are probability distributions, using distance-based kernels. A numerical example demonstrates the framework's feasibility.
Gaussian processes constitute a powerful and well-understood method for non-parametric regression and classification. In the classical framework, the training data consist of deterministic vector-valued inputs and the corresponding (noisy) measurements, whose joint distribution is assumed to be Gaussian. In many practical applications, however, either the inputs are noisy, i.e., each input is a vector-valued sample from an unknown probability distribution, or the inputs are themselves probability distributions. In this paper, we address Gaussian process regression with inputs given in the form of probability distributions and propose a framework that is based on distances between such inputs. To this end, we review different admissible distance measures and provide a numerical example that demonstrates our framework.
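To make the idea of a distance-based kernel concrete, the following is a minimal sketch and not the paper's implementation. It assumes, purely for illustration, that every input is a one-dimensional Gaussian described by its mean and standard deviation, that the distance is the 2-Wasserstein distance (which for one-dimensional Gaussians has the closed form W2^2 = (m1 - m2)^2 + (s1 - s2)^2), and that the kernel is a squared exponential of that distance. The function names, the length scale, and the noise variance are hypothetical choices; the abstract does not state which distance or kernel the paper adopts.

```python
import numpy as np


def w2_squared_gaussians(mu_a, sigma_a, mu_b, sigma_b):
    """Squared 2-Wasserstein distance between 1-D Gaussians N(mu, sigma^2)."""
    return (mu_a - mu_b) ** 2 + (sigma_a - sigma_b) ** 2


def distance_kernel(P, Q, length_scale=1.0):
    """Squared-exponential kernel on the W2 distance (illustrative assumption).

    P and Q are arrays of shape (n, 2) and (m, 2); each row is (mean, std)
    of one input distribution. Returns the (n, m) kernel matrix.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    d2 = w2_squared_gaussians(
        P[:, None, 0], P[:, None, 1], Q[None, :, 0], Q[None, :, 1]
    )
    return np.exp(-d2 / (2.0 * length_scale ** 2))


def gp_posterior(P_train, y_train, P_test, noise_var=1e-2, length_scale=1.0):
    """Standard zero-mean GP regression equations with the kernel above."""
    y_train = np.asarray(y_train, dtype=float)
    K = distance_kernel(P_train, P_train, length_scale)
    K = K + noise_var * np.eye(len(y_train))
    K_s = distance_kernel(P_test, P_train, length_scale)
    K_ss = distance_kernel(P_test, P_test, length_scale)

    # Solve with a Cholesky factor instead of forming an explicit inverse.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    v = np.linalg.solve(L, K_s.T)

    mean = K_s @ alpha
    cov = K_ss - v.T @ v
    return mean, cov


# Toy data (made up): three input distributions given as (mean, std),
# each paired with one scalar observation.
P_train = [(0.0, 1.0), (1.0, 0.5), (2.0, 1.5)]
y_train = [0.1, 0.9, 1.7]
P_test = [(1.5, 1.0)]

mean, cov = gp_posterior(P_train, y_train, P_test)
print("posterior mean:", mean)
print("posterior variance:", np.diag(cov))
```

In one dimension the quantile-function representation embeds the 2-Wasserstein space isometrically into a Hilbert space, which is why the squared-exponential form yields a valid (positive semi-definite) kernel in this sketch. Other distances or higher-dimensional inputs need a separate admissibility check.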