Han Cheng Lie

ME
h-index10
5papers
32citations
Novelty41%
AI Score44

5 Papers

31.6NAMay 29
Error bounds for approximate posteriors from likelihood-informed reduced-order models

Han Cheng Lie, Jakob Scheffels, Elisabeth Ullmann

In the design of computational methods for Bayesian inverse problems, costly forward model evaluations make it difficult to sample from or compute the posterior. This motivates the need for approximate forward models that are cheaper to evaluate. We consider reduced-order forward models which exploit the lower-dimensional structure in the Bayesian inverse problem by projecting to the "likelihood-informed subspace" of the parameter space where the prior-to-posterior update is significant. However, the theoretical properties of these reduced-order forward models and their impact on the solution of the Baysian inverse problem are not always well-understood. In this work we consider linear Gaussian inverse problems with a possibly singular prior covariance matrix. We analyse a recently proposed reduced-order model which uses a Petrov-Galerkin projection to likelihood-informed subspaces that arise in optimal low-rank approximations of the posterior covariance matrix. We bound the error in the resulting approximation of the root prior-preconditioned Hessian of the data misfit. Based on this we also bound the errors of the approximate posterior covariance and mean. Our analysis shows that this reduced-order model recovers the exact posterior when the rank of the reduced-order model is equal to the "intrinsic dimension" of the inverse problem, i.e. the rank of the prior-preconditioned Hessian. Two numerical experiments from structural engineering illustrate the performance of our bounds.

MEDec 27, 2018
Implicit Probabilistic Integrators for ODEs

Onur Teymur, Han Cheng Lie, Tim Sullivan et al.

We introduce a family of implicit probabilistic integrators for initial value problems (IVPs), taking as a starting point the multistep Adams-Moulton method. The implicit construction allows for dynamic feedback from the forthcoming time-step, in contrast to previous probabilistic integrators, all of which are based on explicit methods. We begin with a concise survey of the rapidly-expanding field of probabilistic ODE solvers. We then introduce our method, which builds on and adapts the work of Conrad et al. (2016) and Teymur et al. (2016), and provide a rigorous proof of its well-definedness and convergence. We discuss the problem of the calibration of such integrators and suggest one approach. We give an illustrative example highlighting the effect of the use of probabilistic integrators - including our new method - in the setting of parameter inference within an inverse problem.

OCMar 18, 2016
Convexity of a stochastic control functional related to importance sampling of Itô diffusions

Han Cheng Lie

We consider the problem of rare event importance sampling, where the random variable of interest is a path functional of an Itô diffusion computed up to the first exit from a $d$-dimensional bounded domain. Dupuis and Wang (\textit{Ann. Appl. Probab.}, 15 (2005), pp. 1-38) studied the importance sampling problem by formulating it as a stochastic optimal control problem, where the value function is related to the conditional cumulant generating function of the random variable. In this paper, we show that the sufficient conditions for the value function to be twice-differentiable and $α$-uniformly Hölder continuous on the closure of the domain are also sufficient conditions for positive definiteness of the second variation of the control functional on the space of differentiable, $α$-uniformly Hölder continuous $\mathbb{R}^d$-valued feedback controls. We derive an expression for the second variation using Kazamaki's sufficient condition for $L^q$-boundedness of exponential martingales, and using Fredholm theory to prove the finiteness of the moment generating function of the first exit time over any bounded interval containing the origin. The strict convexity result suggests that one may be able to solve the corresponding Hamilton-Jacobi-Bellman boundary value problem in a dimension-robust way, by combining convex optimisation and Monte Carlo methods. We apply the result to analyse a gradient descent algorithm proposed by Hartmann and Schütte (\textit{J. Stat. Mech. Theor. Exp.} (2012), P11004) for efficient rare event simulation.

23.8MEMar 20
Goal-oriented learning of stochastic dynamical systems using error bounds on path-space observables

Joanna Zou, Han Cheng Lie, Youssef Marzouk

The governing equations of stochastic dynamical systems often become cost-prohibitive for numerical simulation at large scales. Surrogate models of the governing equations, learned from data of the high-fidelity system, are routinely used to predict key observables with greater efficiency. However, standard choices of loss function for learning the surrogate model fail to provide error guarantees in path-dependent observables, such as reaction rates of molecular dynamical systems. This paper introduces an error bound for path-space observables and employs it as a novel variational loss for the goal-oriented learning of a stochastic dynamical system. We show the error bound holds for a broad class of observables, including mean first hitting times on unbounded time domains. We derive an analytical gradient of the goal-oriented loss function by leveraging the formula for Frechet derivatives of expected path functionals, which remains tractable for implementation in stochastic gradient descent schemes. We demonstrate that surrogate models of overdamped Langevin systems developed via goal-oriented learning achieve improved accuracy in predicting the statistics of a first hitting time observable and robustness to distributional shift in the data.

LGOct 30, 2024
Data subsampling for Poisson regression with pth-root-link

Han Cheng Lie, Alexander Munteanu

We develop and analyze data subsampling techniques for Poisson regression, the standard model for count data $y\in\mathbb{N}$. In particular, we consider the Poisson generalized linear model with ID- and square root-link functions. We consider the method of coresets, which are small weighted subsets that approximate the loss function of Poisson regression up to a factor of $1\pm\varepsilon$. We show $Ω(n)$ lower bounds against coresets for Poisson regression that continue to hold against arbitrary data reduction techniques up to logarithmic factors. By introducing a novel complexity parameter and a domain shifting approach, we show that sublinear coresets with $1\pm\varepsilon$ approximation guarantee exist when the complexity parameter is small. In particular, the dependence on the number of input points can be reduced to polylogarithmic. We show that the dependence on other input parameters can also be bounded sublinearly, though not always logarithmically. In particular, we show that the square root-link admits an $O(\log(y_{\max}))$ dependence, where $y_{\max}$ denotes the largest count presented in the data, while the ID-link requires a $Θ(\sqrt{y_{\max}/\log(y_{\max})})$ dependence. As an auxiliary result for proving the tightness of the bound with respect to $y_{\max}$ in the case of the ID-link, we show an improved bound on the principal branch of the Lambert $W_0$ function, which may be of independent interest. We further show the limitations of our analysis when $p$th degree root-link functions for $p\geq 3$ are considered, which indicate that other analytical or computational methods would be required if such a generalization is even possible.