Gonzalo Rios

ML
5 papers
112 citations
Novelty 53%
AI Score 25

5 Papers

ML · Jan 30, 2020
Transport Gaussian Processes for Regression

Gonzalo Rios

Gaussian process (GP) priors are non-parametric generative models with appealing modelling properties for Bayesian inference: they can model non-linear relationships through noisy observations, have closed-form expressions for training and inference, and are governed by interpretable hyperparameters. However, GP models rely on Gaussianity, an assumption that does not hold in several real-world scenarios, e.g., when observations are bounded or have extreme-value dependencies, a natural phenomenon in physics, finance and social sciences. Although beyond-Gaussian stochastic processes have caught the attention of the GP community, a principled definition and rigorous treatment are still lacking. In this regard, we propose a methodology to construct stochastic processes, which include GPs, warped GPs, Student-t processes and several others under a single unified approach. We also provide formulas and algorithms for training and inference of the proposed models in the regression problem. Our approach is inspired by layer-based models, where each proposed layer changes a specific property of the generated stochastic process. That, in turn, allows us to push forward a standard Gaussian white noise prior towards other more expressive stochastic processes, for which marginals and copulas need not be Gaussian, while retaining the appealing properties of GPs. We validate the proposed model through experiments with real-world data.
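The layered push-forward idea described in this abstract can be illustrated with a minimal sketch; it is not the paper's implementation. It assumes two hypothetical layers applied in turn to standard Gaussian white noise: a covariance layer (the Cholesky factor of a squared-exponential kernel) and a marginal layer (an elementwise exponential, giving positive, log-normal marginals). The kernel, the jitter value and all function names are assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(x, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix on 1-D inputs (assumed kernel choice)."""
    d = x[:, None] - x[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def covariance_layer(white_noise, K, jitter=1e-6):
    """Correlate white noise: if w ~ N(0, I), then L @ w ~ N(0, K) with K = L L^T."""
    L = np.linalg.cholesky(K + jitter * np.eye(K.shape[0]))
    return L @ white_noise

def marginal_layer(gp_sample):
    """Elementwise warping that changes the marginals (here to log-normal)."""
    return np.exp(gp_sample)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 30)
white_noise = rng.standard_normal((x.size, 3))   # three independent draws
gp_samples = covariance_layer(white_noise, rbf_kernel(x))
positive_samples = marginal_layer(gp_samples)    # non-Gaussian marginals
```

Each layer alters one property of the process (first the correlation, then the marginals), which is the sense in which white noise is pushed forward through a stack of maps.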

ML · Jun 23, 2019
Compositionally-Warped Gaussian Processes

Gonzalo Rios, Felipe Tobar

The Gaussian process (GP) is a nonparametric prior distribution over functions indexed by time, space, or other high-dimensional index set. The GP is a flexible model, yet its limitation is given by its very nature: it can only model Gaussian marginal distributions. To model non-Gaussian data, a GP can be warped by a nonlinear transformation (or warping) as performed by warped GPs (WGPs) and more computationally demanding alternatives such as Bayesian WGPs and deep GPs. However, the WGP requires a numerical approximation of the inverse warping for prediction, which increases the computational complexity in practice. To sidestep this issue, we construct a novel class of warpings consisting of compositions of multiple elementary functions, for which the inverse is known explicitly. We then propose the compositionally-warped GP (CWGP), a non-Gaussian generative model whose expressiveness follows from its deep compositional architecture and whose computational efficiency is guaranteed by the analytical inverse warping. Experimental validation using synthetic and real-world datasets confirms that the proposed CWGP is robust to the choice of warpings and provides more accurate point predictions, better trained models and shorter computation times than WGP.
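A small sketch can show why composing elementary warpings keeps the inverse explicit: the inverse of a composition is the composition of the individual inverses in reverse order. The two elementary functions below (affine and sinh-arcsinh), their parameter values and the class names are assumptions chosen for illustration, not necessarily the set used in the paper.

```python
import numpy as np

class Affine:
    """y -> a * y + b, with a != 0."""
    def __init__(self, a, b):
        self.a, self.b = a, b
    def forward(self, y):
        return self.a * y + self.b
    def inverse(self, z):
        return (z - self.b) / self.a

class SinhArcsinh:
    """y -> sinh(tail * arcsinh(y) - skew), with tail > 0."""
    def __init__(self, skew, tail):
        self.skew, self.tail = skew, tail
    def forward(self, y):
        return np.sinh(self.tail * np.arcsinh(y) - self.skew)
    def inverse(self, z):
        return np.sinh((np.arcsinh(z) + self.skew) / self.tail)

def warp(layers, y):
    """Apply the layers in order."""
    for layer in layers:
        y = layer.forward(y)
    return y

def unwarp(layers, z):
    """Closed-form inverse: apply each layer's inverse in reverse order."""
    for layer in reversed(layers):
        z = layer.inverse(z)
    return z

layers = [Affine(2.0, -1.0), SinhArcsinh(0.3, 1.5), Affine(0.5, 0.2)]
y = np.linspace(-3.0, 3.0, 7)
z = warp(layers, y)
y_recovered = unwarp(layers, z)
```

No root-finding is needed at prediction time, since `unwarp` evaluates only closed-form expressions.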

ML · May 28, 2018
Bayesian Learning with Wasserstein Barycenters

Julio Backhoff-Veraguas, Joaquin Fontbona, Gonzalo Rios et al.

We introduce and study a novel model-selection strategy for Bayesian learning, based on optimal transport, along with its associated predictive posterior law: the Wasserstein population barycenter of the posterior law over models. We first show how this estimator, termed Bayesian Wasserstein barycenter (BWB), arises naturally in a general, parameter-free Bayesian model-selection framework, when the considered Bayesian risk is the Wasserstein distance. Examples are given, illustrating how the BWB extends some classic parametric and non-parametric selection strategies. Furthermore, we also provide explicit conditions granting the existence and statistical consistency of the BWB, and discuss some of its general and specific properties, providing insights into its advantages compared to usual choices, such as the model average estimator. Finally, we illustrate how this estimator can be computed using the stochastic gradient descent (SGD) algorithm in Wasserstein space introduced in a companion paper arXiv:2201.04232v2 [math.OC], and provide a numerical example for experimental validation of the proposed method.
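To make the contrast with the model average concrete, consider the special case of univariate Gaussian models, where the 2-Wasserstein barycenter has a known closed form: it is again Gaussian, with mean and standard deviation equal to the weighted averages of the individual means and standard deviations. The sketch below covers only this special case and does not implement the SGD algorithm mentioned in the abstract; the function names and the example numbers are assumptions.

```python
import numpy as np

def gaussian_w2_barycenter(means, stds, weights):
    """2-Wasserstein barycenter of 1-D Gaussians N(m_i, s_i^2): returns (mean, std)."""
    means, stds, weights = (np.asarray(a, dtype=float) for a in (means, stds, weights))
    weights = weights / weights.sum()
    return float(weights @ means), float(weights @ stds)

def model_average_moments(means, stds, weights):
    """Mean and std of the mixture sum_i w_i N(m_i, s_i^2), i.e. the model average."""
    means, stds, weights = (np.asarray(a, dtype=float) for a in (means, stds, weights))
    weights = weights / weights.sum()
    mean = weights @ means
    var = weights @ (stds**2 + means**2) - mean**2
    return float(mean), float(np.sqrt(var))

means, stds, weights = [0.0, 2.0], [1.0, 3.0], [0.5, 0.5]
bary_mean, bary_std = gaussian_w2_barycenter(means, stds, weights)
mix_mean, mix_std = model_average_moments(means, stds, weights)
```

In this example both estimators share the same mean, but the barycenter remains a single Gaussian, whereas the model average is a two-component mixture with a larger spread.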

ML · Mar 19, 2018
Learning non-Gaussian Time Series using the Box-Cox Gaussian Process

Gonzalo Rios, Felipe Tobar

Gaussian processes (GPs) are Bayesian nonparametric generative models that provide interpretability of hyperparameters, admit closed-form expressions for training and inference, and are able to accurately represent uncertainty. To model general non-Gaussian data with complex correlation structure, GPs can be paired with an expressive covariance kernel and then fed into a nonlinear transformation (or warping). However, overparametrising the kernel and the warping is known to, respectively, hinder gradient-based training and make the predictions computationally expensive. We remedy this issue by (i) training the model using derivative-free global-optimisation techniques so as to find meaningful maxima of the model likelihood, and (ii) proposing a warping function based on the celebrated Box-Cox transformation that requires minimal numerical approximations---unlike existing warped GP models. We validate the proposed approach by first showing that predictions can be computed analytically, and then on a learning, reconstruction and forecasting experiment using real-world datasets.
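The role of an invertible warping in prediction can be sketched with the standard Box-Cox transform for positive data; the paper's exact warping may differ in detail. Because the transform is monotonically increasing, quantiles of the Gaussian latent prediction map directly to quantiles in the observation space through the closed-form inverse. The function names and the example values below are assumptions.

```python
import numpy as np

def box_cox(y, lam):
    """Standard Box-Cox transform for y > 0."""
    y = np.asarray(y, dtype=float)
    if lam == 0.0:
        return np.log(y)
    return (y**lam - 1.0) / lam

def inv_box_cox(z, lam):
    """Closed-form inverse; requires lam * z + 1 > 0 when lam != 0."""
    z = np.asarray(z, dtype=float)
    if lam == 0.0:
        return np.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)

def predictive_quantiles(latent_mean, latent_std, lam, z_score=1.959964):
    """Map the latent Gaussian median and 95% interval back to observation space."""
    lower = inv_box_cox(latent_mean - z_score * latent_std, lam)
    median = inv_box_cox(latent_mean, lam)
    upper = inv_box_cox(latent_mean + z_score * latent_std, lam)
    return float(lower), float(median), float(upper)

y = np.array([0.5, 1.0, 2.0, 4.0])
y_roundtrip = inv_box_cox(box_cox(y, 0.5), 0.5)
lower, median, upper = predictive_quantiles(latent_mean=2.0, latent_std=0.5, lam=0.5)
```

The resulting interval is asymmetric around the median, which is how the warped model expresses skewed predictive uncertainty.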

ML · Jul 19, 2017
Recovering Latent Signals from a Mixture of Measurements using a Gaussian Process Prior

Felipe Tobar, Gonzalo Rios, Tomás Valdivia et al.

In sensing applications, sensors cannot always measure the latent quantity of interest at the required resolution; sometimes they can only acquire a blurred version of it due to the sensor's transfer function. To recover latent signals when only noisy mixed measurements of the signal are available, we propose the Gaussian process mixture of measurements (GPMM), which models the latent signal as a Gaussian process (GP) and allows us to perform Bayesian inference on such signal conditional on a set of noisy mixtures of measurements. We describe how to train GPMM, that is, to find the hyperparameters of the GP and the mixing weights, and how to perform inference on the latent signal under GPMM; additionally, we identify the solution to the underdetermined linear system resulting from a sensing application as a particular case of GPMM. The proposed model is validated in the recovery of three signals: a smooth synthetic signal, a real-world heart-rate time series and a step function, where GPMM outperformed the standard GP in terms of estimation error, uncertainty representation and recovery of the spectral content of the latent signal.
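The inference step can be sketched as standard Gaussian conditioning under a linear observation model, assuming measurements of the form y = W f + noise, where f is the latent signal on a grid with GP prior covariance K and W holds the mixing weights. This is an illustrative sketch with fixed hyperparameters, not the paper's training procedure; the moving-average W, the kernel and the noise level are assumptions.

```python
import numpy as np

def rbf_kernel(x1, x2, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between two sets of 1-D inputs."""
    d = x1[:, None] - x2[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def mixture_posterior(K, W, y, noise_var):
    """Posterior of f given y = W f + e, with f ~ N(0, K) and e ~ N(0, noise_var * I)."""
    S = W @ K @ W.T + noise_var * np.eye(W.shape[0])   # covariance of the measurements
    KWt = K @ W.T                                      # cross-covariance of f and y
    mean = KWt @ np.linalg.solve(S, y)
    cov = K - KWt @ np.linalg.solve(S, KWt.T)
    return mean, cov

x = np.linspace(0.0, 10.0, 50)
K = rbf_kernel(x, x)

# Hypothetical sensor: each measurement averages 5 neighbouring latent values.
width = 5
W = np.zeros((x.size - width + 1, x.size))
for i in range(W.shape[0]):
    W[i, i:i + width] = 1.0 / width

rng = np.random.default_rng(0)
f_true = np.sin(x)
y = W @ f_true + 0.05 * rng.standard_normal(W.shape[0])

mean, cov = mixture_posterior(K, W, y, noise_var=0.05**2)
```

Here there are fewer measurements (46) than latent values (50), so the linear system alone is underdetermined; the GP prior supplies the extra structure needed for a unique posterior.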