Olivier Zahm

ML
h-index15
10papers
215citations
Novelty53%
AI Score29

10 Papers

NADec 23, 2018
Randomized residual-based error estimators for parametrized equations

Kathrin Smetana, Olivier Zahm, Anthony T Patera

We propose a randomized a posteriori error estimator for reduced order approximations of parametrized (partial) differential equations. The error estimator has several important properties: the effectivity is close to unity with prescribed lower and upper bounds at specified high probability; the estimator does not require the calculation of stability (coercivity, or inf-sup) constants; the online cost to evaluate the a posteriori error estimator is commensurate with the cost to find the reduced order approximation; the probabilistic bounds extend to many queries with only modest increase in cost. To build this estimator, we first estimate the norm of the error with a Monte-Carlo estimator using Gaussian random vectors whose covariance is chosen according to the desired error measure, e.g. user-defined norms or quantity of interest. Then, we introduce a dual problem with random right-hand side the solution of which allows us to rewrite the error estimator in terms of the residual of the original equation. In order to have a fast-to-evaluate estimator, model order reduction methods can be used to approximate the random dual solutions. Here, we propose a greedy algorithm that is guided by a scalar quantity of interest depending on the error estimator. Numerical experiments on a multi-parametric Helmholtz problem demonstrate that this strategy yields rather low-dimensional reduced dual spaces.

NAJan 27, 2016
Interpolation of inverse operators for preconditioning parameter-dependent equations

Olivier Zahm, Anthony Nouy

We propose a method for the construction of preconditioners of parameter-dependent matrices for the solution of large systems of parameter-dependent equations. The proposed method is an interpolation of the matrix inverse based on a projection of the identity matrix with respect to the Frobenius norm. Approximations of the Frobenius norm using random matrices are introduced in order to handle large matrices. The resulting statistical estimators of the Frobenius norm yield quasi-optimal projections that are controlled with high probability. Strategies for the adaptive selection of interpolation points are then proposed for different objectives in the context of projection-based model order reduction methods: the improvement of residual-based error estimators, the improvement of the projection on a given reduced approximation space, or the recycling of computations for sampling based model order reduction methods.

NAOct 30, 2016
Projection based model order reduction methods for the estimation of vector-valued variables of interest

Olivier Zahm, Marie Billaud-Friess, Anthony Nouy

We propose and compare goal-oriented projection based model order reduction methods for the estimation of vector-valued functionals of the solution of parameter-dependent equations. The first projection method is a generalization of the classical primal-dual method to the case of vector-valued variables of interest. We highlight the role played by three reduced spaces: the approximation space and the test space associated to the primal variable, and the approximation space associated to the dual variable. Then we propose a Petrov-Galerkin projection method based on a saddle point problem involving an approximation space for the primal variable and an approximation space for an auxiliary variable. A goal-oriented choice of the latter space, defined as the sum of two spaces, allows us to improve the approximation of the variable of interest compared to a primal-dual method using the same reduced spaces. Then, for both approaches, we derive computable error estimates for the approximations of the variable of interest and we propose greedy algorithms for the goal-oriented construction of reduced spaces. The performance of the algorithms are illustrated on numerical examples and compared to standard (non goal-oriented) algorithms.

MLFeb 27, 2024
Sequential transport maps using SoS density estimation and $α$-divergences

Benjamin Zanger, Olivier Zahm, Tiangang Cui et al.

Transport-based density estimation methods are receiving growing interest because of their ability to efficiently generate samples from the approximated density. We further invertigate the sequential transport maps framework proposed from arXiv:2106.04170 arXiv:2303.02554, which builds on a sequence of composed Knothe-Rosenblatt (KR) maps. Each of those maps are built by first estimating an intermediate density of moderate complexity, and then by computing the exact KR map from a reference density to the precomputed approximate density. In our work, we explore the use of Sum-of-Squares (SoS) densities and $α$-divergences for approximating the intermediate densities. Combining SoS densities with $α$-divergence interestingly yields convex optimization problems which can be efficiently solved using semidefinite programming. The main advantage of $α$-divergences is to enable working with unnormalized densities, which provides benefits both numerically and theoretically. In particular, we provide a new convergence analyses of the sequential transport maps based on information geometric properties of $α$-divergences. The choice of intermediate densities is also crucial for the efficiency of the method. While tempered (or annealed) densities are the state-of-the-art, we introduce diffusion-based intermediate densities which permits to approximate densities known from samples only. Such intermediate densities are well-established in machine learning for generative modeling. Finally we propose low-dimensional maps (or lazy maps) for dealing with high-dimensional problems and numerically demonstrate our methods on Bayesian inference problems and unsupervised learning tasks.

MLJun 19, 2024
Coupled Input-Output Dimension Reduction: Application to Goal-oriented Bayesian Experimental Design and Global Sensitivity Analysis

Qiao Chen, Elise Arnaud, Ricardo Baptista et al.

We introduce a new method to jointly reduce the dimension of the input and output space of a function between high-dimensional spaces. Choosing a reduced input subspace influences which output subspace is relevant and vice versa. Conventional methods focus on reducing either the input or output space, even though both are often reduced simultaneously in practice. Our coupled approach naturally supports goal-oriented dimension reduction, where either an input or output quantity of interest is prescribed. We consider, in particular, goal-oriented sensor placement and goal-oriented sensitivity analysis, which can be viewed as dimension reduction where the most important output or, respectively, input components are chosen. Both applications present difficult combinatorial optimization problems with expensive objectives such as the expected information gain and Sobol' indices. By optimizing gradient-based bounds, we can determine the most informative sensors and most influential parameters as the largest diagonal entries of some diagnostic matrices, thus bypassing the combinatorial optimization and objective evaluation.

MLJun 18, 2024
Sharp detection of low-dimensional structure in probability measures via dimensional logarithmic Sobolev inequalities

Matthew T. C. Li, Tiangang Cui, Fengyi Li et al.

Identifying low-dimensional structure in high-dimensional probability measures is an essential pre-processing step for efficient sampling. We introduce a method for identifying and approximating a target measure $π$ as a perturbation of a given reference measure $μ$ along a few significant directions of $\mathbb{R}^{d}$. The reference measure can be a Gaussian or a nonlinear transformation of a Gaussian, as commonly arising in generative modeling. Our method extends prior work on minimizing majorizations of the Kullback--Leibler divergence to identify optimal approximations within this class of measures. Our main contribution unveils a connection between the \emph{dimensional} logarithmic Sobolev inequality (LSI) and approximations with this ansatz. Specifically, when the target and reference are both Gaussian, we show that minimizing the dimensional LSI is equivalent to minimizing the KL divergence restricted to this ansatz. For general non-Gaussian measures, the dimensional LSI produces majorants that uniformly improve on previous majorants for gradient-based dimension reduction. We further demonstrate the applicability of this analysis to the squared Hellinger distance, where analogous reasoning shows that the dimensional Poincaré inequality offers improved bounds.

MLJun 8, 2021
Conditional Deep Inverse Rosenblatt Transports

Tiangang Cui, Sergey Dolgov, Olivier Zahm

We present a novel offline-online method to mitigate the computational burden of Bayesian inference, particularly in the regime where the posterior densities are computationally demanding to evaluate while real-time inference results are needed. In the offline phase, the proposed method learns the joint law of the parameter random variables and the observable random variables in the tensor-train (TT) format. Then, in the online phase, the resulting order-preserving transport can be conditioned on newly observed data to characterize the posterior random variables in real-time. Compared with the state-of-the-art normalizing flows techniques, our proposed method relies on function approximation, for which we can provide a thorough performance analysis. The function approximation perspective allows us to significantly improve the capability of transport maps in challenging problems with high-dimensional observations and high-dimensional parameters. Capitalizing on this, we present novel heuristics to either reorder or reparametrize the variables to enhance the approximation power of TT. We then integrate the TT-based transport maps and the parameter reordering/reparametrization into a layered composite map to further improve the performance of the resulting inference. We demonstrate the efficiency of the proposed method on various statistical learning tasks involving ordinary differential equations (ODEs) and partial differential equations (PDEs).

MLJan 8, 2021
Learning non-Gaussian graphical models via Hessian scores and triangular transport

Ricardo Baptista, Youssef Marzouk, Rebecca E. Morrison et al.

Undirected probabilistic graphical models represent the conditional dependencies, or Markov properties, of a collection of random variables. Knowing the sparsity of such a graphical model is valuable for modeling multivariate distributions and for efficiently performing inference. While the problem of learning graph structure from data has been studied extensively for certain parametric families of distributions, most existing methods fail to consistently recover the graph structure for non-Gaussian data. Here we propose an algorithm for learning the Markov structure of continuous and non-Gaussian distributions. To characterize conditional independence, we introduce a score based on integrated Hessian information from the joint log-density, and we prove that this score upper bounds the conditional mutual information for a general class of distributions. To compute the score, our algorithm SING estimates the density using a deterministic coupling, induced by a triangular transport map, and iteratively exploits sparse structure in the map to reveal sparsity in the graph. For certain non-Gaussian datasets, we show that our algorithm recovers the graph structure even with a biased approximation to the density. Among other examples, we apply SING to learn the dependencies between the states of a chaotic dynamical system with local interactions.

MLSep 22, 2020
On the representation and learning of monotone triangular transport maps

Ricardo Baptista, Youssef Marzouk, Olivier Zahm

Transportation of measure provides a versatile approach for modeling complex probability distributions, with applications in density estimation, Bayesian inference, generative modeling, and beyond. Monotone triangular transport maps$\unicode{x2014}$approximations of the Knothe$\unicode{x2013}$Rosenblatt (KR) rearrangement$\unicode{x2014}$are a canonical choice for these tasks. Yet the representation and parameterization of such maps have a significant impact on their generality and expressiveness, and on properties of the optimization problem that arises in learning a map from data (e.g., via maximum likelihood estimation). We present a general framework for representing monotone triangular maps via invertible transformations of smooth functions. We establish conditions on the transformation such that the associated infinite-dimensional minimization problem has no spurious local minima, i.e., all local minima are global minima; and we show for target distributions satisfying certain tail conditions that the unique global minimizer corresponds to the KR map. Given a sample from the target, we then propose an adaptive algorithm that estimates a sparse semi-parametric approximation of the underlying KR map. We demonstrate how this framework can be applied to joint and conditional density estimation, likelihood-free inference, and structure learning of directed graphical models, with stable generalization performance across a range of sample sizes.

COMay 31, 2019
Greedy inference with structure-exploiting lazy maps

Michael C. Brennan, Daniele Bigoni, Olivier Zahm et al.

We propose a framework for solving high-dimensional Bayesian inference problems using \emph{structure-exploiting} low-dimensional transport maps or flows. These maps are confined to a low-dimensional subspace (hence, lazy), and the subspace is identified by minimizing an upper bound on the Kullback--Leibler divergence (hence, structured). Our framework provides a principled way of identifying and exploiting low-dimensional structure in an inference problem. It focuses the expressiveness of a transport map along the directions of most significant discrepancy from the posterior, and can be used to build deep compositions of lazy maps, where low-dimensional projections of the parameters are iteratively transformed to match the posterior. We prove weak convergence of the generated sequence of distributions to the posterior, and we demonstrate the benefits of the framework on challenging inference problems in machine learning and differential equations, using inverse autoregressive flows and polynomial maps as examples of the underlying density estimators.