PRApr 13, 2013
Optimal uncertainty quantification for legacy data observations of Lipschitz functionsT. J. Sullivan, M. McKerns, D. Meyer et al.
We consider the problem of providing optimal uncertainty quantification (UQ) --- and hence rigorous certification --- for partially-observed functions. We present a UQ framework within which the observations may be small or large in number, and need not carry information about the probability distribution of the system in operation. The UQ objectives are posed as optimization problems, the solutions of which are optimal bounds on the quantities of interest; we consider two typical settings, namely parameter sensitivities (McDiarmid diameters) and output deviation (or failure) probabilities. The solutions of these optimization problems depend non-trivially (even non-monotonically and discontinuously) upon the specified legacy data. Furthermore, the extreme values are often determined by only a few members of the data set; in our principal physically-motivated example, the bounds are determined by just 2 out of 32 data points, and the remainder carry no information and could be neglected without changing the final answer. We propose an analogue of the simplex algorithm from linear programming that uses these observations to offer efficient and rigorous UQ for high-dimensional systems with high-cardinality legacy data. These findings suggest natural methods for selecting optimal (maximally informative) next experiments.
MLAug 2, 2024
Autoencoders in Function SpaceJustin Bunker, Mark Girolami, Hefin Lambley et al.
Autoencoders have found widespread application in both their original deterministic form and in their variational formulation (VAEs). In scientific applications and in image processing it is often of interest to consider data that are viewed as functions; while discretisation (of differential equations arising in the sciences) or pixellation (of images) renders problems finite dimensional in practice, conceiving first of algorithms that operate on functions, and only then discretising or pixellating, leads to better algorithms that smoothly operate between resolutions. In this paper function-space versions of the autoencoder (FAE) and variational autoencoder (FVAE) are introduced, analysed, and deployed. Well-definedness of the objective governing VAEs is a subtle issue, particularly in function space, limiting applicability. For the FVAE objective to be well defined requires compatibility of the data distribution with the chosen generative model; this can be achieved, for example, when the data arise from a stochastic differential equation, but is generally restrictive. The FAE objective, on the other hand, is well defined in many situations where FVAE fails to be. Pairing the FVAE and FAE objectives with neural operator architectures that can be evaluated on any mesh enables new applications of autoencoders to inpainting, superresolution, and generative modelling of scientific data.
PROct 16, 2017
Well-posedness of Bayesian inverse problems in quasi-Banach spaces with stable priorsT. J. Sullivan
The Bayesian perspective on inverse problems has attracted much mathematical attention in recent years. Particular attention has been paid to Bayesian inverse problems (BIPs) in which the parameter to be inferred lies in an infinite-dimensional space, a typical example being a scalar or tensor field coupled to some observed data via an ODE or PDE. This article gives an introduction to the framework of well-posed BIPs in infinite-dimensional parameter spaces, as advocated by Stuart (Acta Numer. 19:451--559, 2010) and others. This framework has the advantage of ensuring uniformly well-posed inference problems independently of the finite-dimensional discretisation used for numerical solution. Recently, this framework has been extended to the case of a heavy-tailed prior measure in the family of stable distributions, such as an infinite-dimensional Cauchy distribution, for which polynomial moments are infinite or undefined. It is shown that analogues of the Karhunen--Loève expansion for square-integrable random variables can be used to sample such measures on quasi-Banach spaces. Furthermore, under weaker regularity assumptions than those used to date, the Bayesian posterior measure is shown to depend Lipschitz continuously in the Hellinger and total variation metrics upon perturbations of the misfit function and observed data.
NAApr 22, 2021
Bayesian Numerical Methods for Nonlinear Partial Differential EquationsJunyang Wang, Jon Cockayne, Oksana Chkrebtii et al.
The numerical solution of differential equations can be formulated as an inference problem to which formal statistical approaches can be applied. However, nonlinear partial differential equations (PDEs) pose substantial challenges from an inferential perspective, most notably the absence of explicit conditioning formula. This paper extends earlier work on linear PDEs to a general class of initial value problems specified by nonlinear PDEs, motivated by problems for which evaluations of the right-hand-side, initial conditions, or boundary conditions of the PDE have a high computational cost. The proposed method can be viewed as exact Bayesian inference under an approximate likelihood, which is based on discretisation of the nonlinear differential operator. Proof-of-concept experimental results demonstrate that meaningful probabilistic uncertainty quantification for the unknown solution of the PDE can be performed, while controlling the number of times the right-hand-side, initial and boundary conditions are evaluated. A suitable prior model for the solution of the PDE is identified using novel theoretical analysis of the sample path properties of Matérn processes, which may be of independent interest.
STAug 27, 2020
The linear conditional expectation in Hilbert spaceIlja Klebanov, Björn Sprungk, T. J. Sullivan
The linear conditional expectation (LCE) provides a best linear (or rather, affine) estimate of the conditional expectation and hence plays an important rôle in approximate Bayesian inference, especially the Bayes linear approach. This article establishes the analytical properties of the LCE in an infinite-dimensional Hilbert space context. In addition, working in the space of affine Hilbert--Schmidt operators, we establish a regularisation procedure for this LCE. As an important application, we obtain a simple alternative derivation and intuitive justification of the conditional mean embedding formula, a concept widely used in machine learning to perform the conditioning of random variables by embedding them into reproducing kernel Hilbert spaces.
STDec 2, 2019
A Rigorous Theory of Conditional Mean EmbeddingsIlja Klebanov, Ingmar Schuster, T. J. Sullivan
Conditional mean embeddings (CMEs) have proven themselves to be a powerful tool in many machine learning applications. They allow the efficient conditioning of probability distributions within the corresponding reproducing kernel Hilbert spaces (RKHSs) by providing a linear-algebraic relation for the kernel mean embeddings of the respective joint and conditional probability distributions. Both centred and uncentred covariance operators have been used to define CMEs in the existing literature. In this paper, we develop a mathematically rigorous theory for both variants, discuss the merits and problems of each, and significantly weaken the conditions for applicability of CMEs. In the course of this, we demonstrate a beautiful connection to Gaussian conditioning in Hilbert spaces.
NAJan 14, 2019
A Modern Retrospective on Probabilistic NumericsC. J. Oates, T. J. Sullivan
This article attempts to place the emergence of probabilistic numerics as a mathematical-statistical research field within its historical context and to explore how its gradual development can be related both to applications and to a modern formal treatment. We highlight in particular the parallel contributions of Sul'din and Larkin in the 1960s and how their pioneering early ideas have reached a degree of maturity in the intervening period, mediated by paradigms such as average-case analysis and information-based complexity. We provide a subjective assessment of the state of research in probabilistic numerics and highlight some difficulties to be addressed by future works.
NAJul 25, 2018
Convergence Rates of Gaussian ODE FiltersHans Kersting, T. J. Sullivan, Philipp Hennig
A recently-introduced class of probabilistic (uncertainty-aware) solvers for ordinary differential equations (ODEs) applies Gaussian (Kalman) filtering to initial value problems. These methods model the true solution $x$ and its first $q$ derivatives \emph{a priori} as a Gauss--Markov process $\boldsymbol{X}$, which is then iteratively conditioned on information about $\dot{x}$. This article establishes worst-case local convergence rates of order $q+1$ for a wide range of versions of this Gaussian ODE filter, as well as global convergence rates of order $q$ in the case of $q=1$ and an integrated Brownian motion prior, and analyses how inaccurate information on $\dot{x}$ coming from approximate evaluations of $f$ affects these rates. Moreover, we show that, in the globally convergent case, the posterior credible intervals are well calibrated in the sense that they globally contract at the same rate as the truncation error. We illustrate these theoretical results by numerical experiments which might indicate their generalizability to $q \in \{2,3,\dots\}$.
PRJun 30, 2017
Well-posed Bayesian inverse problems and heavy-tailed stable quasi-Banach space priorsT. J. Sullivan
This article extends the framework of Bayesian inverse problems in infinite-dimensional parameter spaces, as advocated by Stuart (Acta Numer. 19:451--559, 2010) and others, to the case of a heavy-tailed prior measure in the family of stable distributions, such as an infinite-dimensional Cauchy distribution, for which polynomial moments are infinite or undefined. It is shown that analogues of the Karhunen--Loève expansion for square-integrable random variables can be used to sample such measures on quasi-Banach spaces. Furthermore, under weaker regularity assumptions than those used to date, the Bayesian posterior measure is shown to depend Lipschitz continuously in the Hellinger metric upon perturbations of the misfit function and observed data.