37.6LGJun 3
Neural Galerkin Normalizing Flows for Bayesian Inference of Diffusions with Inaccessible BoundariesRiccardo Saporiti, Fabio Nobile
One of the primary challenges in Bayesian inference on the parameters of a diffusion model from discrete observations is the unavailability of an analytical expression for the transition density function between consecutive observation times, which is needed to derive the likelihood function. Extending previous studies that solve Fokker-Planck (FP) type partial differential equations with Normalizing Flows, we propose a new Normalizing Flow architecture to learn the transition density function of the diffusion process between two observation times. We do so by solving in a Neural Galerkin framework the associated FP equation with a Dirac mass as initial condition, over a specified training distribution of the initial datum and the coefficients of the diffusion. We specifically focus on processes whose diffusion matrix vanishes in certain inaccessible boundary regions, such as Stochastic Volatility models that satisfy a Feller condition. The product of the obtained transition densities evaluated along the observed trajectory approximates the likelihood function, thereby enabling cheap posterior sampling via Markov chain Monte Carlo (MCMC). After the offline training phase, inference becomes significantly more efficient, as it avoids the need to solve the FP equation in real time for each parameter proposed by the MCMC sampler or to rely on other likelihood-free methods for Bayesian inference that involve repeated simulation of diffusion bridges.
NAMar 28, 2016
Analytic regularity and collocation approximation for elliptic PDEs with random domain deformationsJulio E. Castrillon-Candas, Fabio Nobile, Raul F. Tempone
In this work we consider the problem of approximating the statistics of a given Quantity of Interest (QoI) that depends on the solution of a linear elliptic PDE defined over a random domain parameterized by $N$ random variables. The elliptic problem is remapped on to a corresponding PDE with a fixed deterministic domain. We show that the solution can be analytically extended to a well defined region in $\C^{N}$ with respect to the random variables. A sparse grid stochastic collocation method is then used to compute the mean and standard deviation of the QoI. Finally, convergence rates for the mean and variance of the QoI are derived and compared to those obtained in numerical experiments.
NAJun 7, 2018
Least-Squares Padé approximation of parametric and stochastic Helmholtz mapsFrancesca Bonizzoni, Fabio Nobile, Ilaria Perugia et al.
The present work deals with the rational model order reduction method based on the single-point Least-Square (LS) Padé approximation technique introduced in [3]. Algorithmical aspects concerning the construction of the rational LS-Padé approximant are described. In particular, the computation of the Padé denominator is reduced to the calculation of the eigenvector corresponding to the minimal eigenvalue of a Gramian matrix. The LS-Padé technique is employed to approximate the frequency response map associated to various parametric time-harmonic wave problems, namely, a transmission/reflection problem, a scattering problem and a problem in high-frequency regime. In all cases we establish the meromorphy of the frequency response map. The Helmholtz equation with stochastic wavenumber is also considered. In particular, for Lipschitz functionals of the solution, and their corresponding probability measures, we establish weak convergence of the measure derived from the LS-Padé approximant to the true one. 2D numerical tests are performed, which confirm the effectiveness of the approximation method.
NAOct 8, 2017
Multilevel weighted least squares polynomial approximationAbdul-Lateef Haji-Ali, Fabio Nobile, Raúl Tempone et al.
Weighted least squares polynomial approximation uses random samples to determine projections of functions onto spaces of polynomials. It has been shown that, using an optimal distribution of sample locations, the number of samples required to achieve quasi-optimal approximation in a given polynomial subspace scales, up to a logarithmic factor, linearly in the dimension of this space. However, in many applications, the computation of samples includes a numerical discretization error. Thus, obtaining polynomial approximations with a single level method can become prohibitively expensive, as it requires a sufficiently large number of samples, each computed with a sufficiently small discretization error. As a solution to this problem, we propose a multilevel method that utilizes samples computed with different accuracies and is able to match the accuracy of single-level approximations with reduced computational cost. We derive complexity bounds under certain assumptions about polynomial approximability and sample work. Furthermore, we propose an adaptive algorithm for situations where such assumptions cannot be verified a priori. Finally, we provide an efficient algorithm for the sampling from optimal distributions and an analysis of computationally favorable alternative distributions. Numerical experiments underscore the practical applicability of our method.
NAMar 29, 2017
Hybrid collocation perturbation for PDEs with random domainsJulio E. Castrillon-Candas, Fabio Nobile, Raul F. Tempone
In this work we consider the problem of approximating the statistics of a given Quantity of Interest (QoI) that depends on the solution of a linear elliptic PDE defined over a random domain parameterized by $N$ random variables. The random domain is split into large and small variations contributions. The large variations are approximated by applying a sparse grid stochastic collocation method. The small variations are approximated with a stochastic collocation-perturbation method. Convergence rates for the variance of the QoI are derived and compared to those obtained in numerical experiments. Our approach significantly reduces the dimensionality of the stochastic problem. The computational cost of this method increases at most quadratically with respect to the number of dimensions of the small variations. Moreover, for the case that the small and large variations are independent the cost increases linearly.
NAOct 24, 2016
Discrete least-squares approximations over optimized downward closed polynomial spaces in arbitrary dimensionAlbert Cohen, Giovanni Migliorati, Fabio Nobile
We analyze the accuracy of the discrete least-squares approximation of a function $u$ in multivariate polynomial spaces $\mathbb{P}_Λ:={\rm span} \{y\mapsto y^ν\,: \, ν\in Λ\}$ with $Λ\subset \mathbb{N}_0^d$ over the domain $Γ:=[-1,1]^d$, based on the sampling of this function at points $y^1,\dots,y^m \in Γ$. The samples are independently drawn according to a given probability density $ρ$ belonging to the class of multivariate beta densities, which includes the uniform and Chebyshev densities as particular cases. We restrict our attention to polynomial spaces associated with \emph{downward closed} sets $Λ$ of \emph{prescribed} cardinality $n$, and we optimize the choice of the space for the given sample. This implies, in particular, that the selected polynomial space depends on the sample. We are interested in comparing the error of this least-squares approximation measured in $L^2(Γ,dρ)$ with the best achievable polynomial approximation error when using downward closed sets of cardinality $n$. We establish conditions between the dimension $n$ and the size $m$ of the sample, under which these two errors are proven to be comparable. Our main finding is that the dimension $d$ enters only moderately in the resulting trade-off between $m$ and $n$, in terms of a logarithmic factor $\ln(d)$, and is even absent when the optimization is restricted to a relevant subclass of downward closed sets, named {\it anchored} sets. In principle, this allows one to use these methods in arbitrarily high or even infinite dimension. Our analysis builds upon [2] which considered fixed and nonoptimized downward closed multi-index sets. Potential applications of the proposed results are found in the development and analysis of numerical methods for computing the solution to high-dimensional parametric or stochastic PDEs.
NAJul 16, 2018
Sparse approximation of multilinear problems with applications to kernel-based methods in UQFabio Nobile, Raul Tempone, Soeren Wolfers
We provide a framework for the sparse approximation of multilinear problems and show that several problems in uncertainty quantification fit within this framework. In these problems, the value of a multilinear map has to be approximated using approximations of different accuracy and computational work of the arguments of this map. We propose and analyze a generalized version of Smolyak's algorithm, which provides sparse approximation formulas with convergence rates that mitigate the curse of dimension that appears in multilinear approximation problems with a large number of arguments. We apply the general framework to response surface approximation and optimization under uncertainty for parametric partial differential equations using kernel-based approximation. The theoretical results are supplemented by numerical experiments.
NAJul 16, 2012
Sparse spectral approximations for computing polynomial functionalsErwan Faou, Fabio Nobile, Christophe Vuillot
We give a new fast method for evaluating sprectral approximations of nonlinear polynomial functionals. We prove that the new algorithm is convergent if the functions considered are smooth enough, under a general assumption on the spectral eigenfunctions that turns out to be satisfied in many cases, including the Fourier and Hermite basis.
71.6NAMay 23
Optimized multilevel Monte Carlo methods in Banach spacesKristin Kirchner, Fabio Nobile, Christoph Schwab et al.
We present a theoretical and numerical analysis of Monte Carlo methods for the estimation of statistical moments of random variables $X:Ω\rightarrow E$ taking values in a Banach space $E$. For practical computation, we consider finite-dimensional approximation subspaces ${(E_\ell)_{\ell\in\mathbb{N}}\subset E}$ of increasing dimension. We develop a refined error analysis that explicitly accounts for a dependence of the Rademacher type constants on the dimension of $E_\ell$, leading to novel complexity results for single- and multilevel Monte Carlo methods to estimate the mean and injective moments of arbitrary order, which are, in certain cases, sharper than those derived in [Kirchner, Schwab, J. Funct. Anal, 2024]. Moreover, we show that, in favorable cases, the resulting error-vs.-work bounds are independent of the Rademacher type of $E$. We then focus on $L^p(S)$-valued random variables for a $σ$-finite measure space satisfying certain approximation properties, and prove that for a random variable $X\in L^q(Ω;L^p(S))\cap L^p(S;L^q(Ω))$, with $q\in (1,\infty)$ and $p\in [1,\infty)$, the $L^q$-convergence rate of a Monte Carlo estimator is determined exclusively by the integrability parameter $\min\{q,2\}$, with no dependence on the Rademacher type $\min\{p,2\}$ of $L^p(S)$. We further investigate the impact of measuring the (multilevel) Monte Carlo error in the $L^q(Ω;L^p(S))$-norm while $X$ possesses additional regularity, $X\in L^{\tilde{q}}(Ω;L^p(S))\cap L^p(S;L^{\tilde{q}}(Ω))$ with $\tilde{q}\in [q,\infty)$. This analysis reveals an interplay between the sampling error and the strong approximation error, and leads to optimized error-vs.-work bounds for both single- and multilevel Monte Carlo methods. Numerical experiments confirm the sharpness of the analyses presented.
8.9LGMar 19
Neural Galerkin Normalizing Flow for Transition Probability Density Functions of Diffusion ModelsRiccardo Saporiti, Fabio Nobile
We propose a new Neural Galerkin Normalizing Flow framework to approximate the transition probability density function of a diffusion process by solving the corresponding Fokker-Planck equation with an atomic initial distribution, parametrically with respect to the location of the initial mass. By using Normalizing Flows, we look for the solution as a transformation of the transition probability density function of a reference stochastic process, ensuring that our approximation is structure-preserving and automatically satisfies positivity and mass conservation constraints. By extending Neural Galerkin schemes to the context of Normalizing Flows, we derive a system of ODEs for the time evolution of the Normalizing Flow's parameters. Adaptive sampling routines are used to evaluate the Fokker-Planck residual in meaningful locations, which is of vital importance to address high-dimensional PDEs. Numerical results show that this strategy captures key features of the true solution and enforces the causal relationship between the initial datum and the density function at subsequent times. After completing an offline training phase, online evaluation becomes significantly more cost-effective than solving the PDE from scratch. The proposed method serves as a promising surrogate model, which could be deployed in many-query problems associated with stochastic differential equations, like Bayesian inference, simulation, and diffusion bridge generation.
LGMar 3
LAGO: A Local-Global Optimization Framework Combining Trust Region Methods and Bayesian OptimizationEliott Van Dieren, Tommaso Vanzan, Fabio Nobile
We introduce LAGO, a LocAl-Global Optimization algorithm that combines gradient-enhanced Bayesian Optimization (BO) with gradient-based trust region local refinement through an adaptive competition mechanism. At each iteration, global and local optimization strategies independently propose candidate points, and the next evaluation is selected based on predicted improvement. LAGO separates global exploration from local refinement at the proposal level: the BO acquisition function is optimized outside the active trust region, while local function and gradient evaluations are incorporated into the global gradient-enhanced Gaussian process only when they satisfy a lengthscale-based minimum-distance criterion, reducing the risk of numerical instability during the local exploitation. This enables efficient local refinement when reaching promising regions, without sacrificing a global search of the design space. As a result, the method achieves an improved exploration of the full design space compared to standard non-linear local optimization algorithms for smooth functions, while maintaining fast local convergence in regions of interest.
MLDec 23, 2019
Sparse Polynomial Chaos expansions using Variational Relevance Vector MachinesPanagiotis Tsilifis, Iason Papaioannou, Daniel Straub et al.
The challenges for non-intrusive methods for Polynomial Chaos modeling lie in the computational efficiency and accuracy under a limited number of model simulations. These challenges can be addressed by enforcing sparsity in the series representation through retaining only the most important basis terms. In this work, we present a novel sparse Bayesian learning technique for obtaining sparse Polynomial Chaos expansions which is based on a Relevance Vector Machine model and is trained using Variational Inference. The methodology shows great potential in high-dimensional data-driven settings using relatively few data points and achieves user-controlled sparse levels that are comparable to other methods such as compressive sensing. The proposed approach is illustrated on two numerical examples, a synthetic response function that is explored for validation purposes and a low-carbon steel plate with random Young's modulus and random loading, which is modeled by stochastic finite element with 38 input random variables.
NAJul 25, 2017
Uncertainty Quantification of geochemical and mechanical compaction in layered sedimentary basinsIvo Colombo, Fabio Nobile, Giovanni Porta et al.
In this work we propose an Uncertainty Quantification methodology for sedimentary basins evolution under mechanical and geochemical compaction processes, which we model as a coupled, time-dependent, non-linear, monodimensional (depth-only) system of PDEs with uncertain parameters. While in previous works (Formaggia et al. 2013, Porta et al., 2014) we assumed a simplified depositional history with only one material, in this work we consider multi-layered basins, in which each layer is characterized by a different material, and hence by different properties. This setting requires several improvements with respect to our earlier works, both concerning the deterministic solver and the stochastic discretization. On the deterministic side, we replace the previous fixed-point iterative solver with a more efficient Newton solver at each step of the time-discretization. On the stochastic side, the multi-layered structure gives rise to discontinuities in the dependence of the state variables on the uncertain parameters, that need an appropriate treatment for surrogate modeling techniques, such as sparse grids, to be effective. We propose an innovative methodology to this end which relies on a change of coordinate system to align the discontinuities of the target function within the random parameter space. The reference coordinate system is built upon exploiting physical features of the problem at hand. We employ the locations of material interfaces, which display a smooth dependence on the random parameters and are therefore amenable to sparse grid polynomial approximations. We showcase the capabilities of our numerical methodologies through two synthetic test cases. In particular, we show that our methodology reproduces with high accuracy multi-modal probability density functions displayed by target state variables (e.g., porosity).
NAAug 30, 2016
Multilevel ensemble Kalman filtering for spatially extended modelsAlexey Chernov, Haakon Hoel, Kody Law et al.
This work embeds a multilevel Monte Carlo (MLMC) sampling strategy into the Monte Carlo step of the ensemble Kalman filter (EnKF), thereby yielding a multilevel ensemble Kalman filter (MLEnKF) which has provably superior asymptotic cost to a given accuracy level. The development of MLEnKF for finite-dimensional state-spaces in the work [20] is here extended to models with infinite-dimensional state- spaces in the form of spatial fields. A concrete example is given to illustrate the results.
NAJul 21, 2016
Multi-Index Stochastic Collocation for random PDEsAbdul-Lateef Haji-Ali, Fabio Nobile, Lorenzo Tamellini et al.
In this work we introduce the Multi-Index Stochastic Collocation method (MISC) for computing statistics of the solution of a PDE with random data. MISC is a combination technique based on mixed differences of spatial approximations and quadratures over the space of random data. We propose an optimization procedure to select the most effective mixed differences to include in the MISC estimator: such optimization is a crucial step and allows us to build a method that, provided with sufficient solution regularity, is potentially more effective than other multi-level collocation methods already available in literature. We then provide a complexity analysis that assumes decay rates of product type for such mixed differences, showing that in the optimal case the convergence rate of MISC is only dictated by the convergence of the deterministic solver applied to a one dimensional problem. We show the effectiveness of MISC with some computational tests, comparing it with other related methods available in the literature, such as the Multi-Index and Multilevel Monte Carlo, Multilevel Stochastic Collocation, Quasi Optimal Stochastic Collocation and Sparse Composite Collocation methods.
NAJul 21, 2016
Multi-index Stochastic Collocation convergence rates for random PDEs with parametric regularityAbdul-Lateef Haji-Ali, Fabio Nobile, Lorenzo Tamellini et al.
We analyze the recent Multi-index Stochastic Collocation (MISC) method for computing statistics of the solution of a partial differential equation (PDEs) with random data, where the random coefficient is parametrized by means of a countable sequence of terms in a suitable expansion. MISC is a combination technique based on mixed differences of spatial approximations and quadratures over the space of random data and, naturally, the error analysis uses the joint regularity of the solution with respect to both the variables in the physical domain and parametric variables. In MISC, the number of problem solutions performed at each discretization level is not determined by balancing the spatial and stochastic components of the error, but rather by suitably extending the knapsack-problem approach employed in the construction of the quasi-optimal sparse-grids and Multi-index Monte Carlo methods. We use a greedy optimization procedure to select the most effective mixed differences to include in the MISC estimator. We apply our theoretical estimates to a linear elliptic PDEs in which the log-diffusion coefficient is modeled as a random field, with a covariance similar to a Matérn model, whose realizations have spatial regularity determined by a scalar parameter. We conduct a complexity analysis based on a summability argument showing algebraic rates of convergence with respect to the overall computational work. The rate of convergence depends on the smoothness parameter, the physical dimensionality and the efficiency of the linear solver. Numerical experiments show the effectiveness of MISC in this infinite-dimensional setting compared with the Multi-index Monte Carlo method and compare the convergence rate against the rates predicted in our theoretical analysis.
NAMar 25, 2015
Optimization of mesh hierarchies in Multilevel Monte Carlo samplersAbdul Lateef Haji Ali, Fabio Nobile, Erik von Schwerin et al.
We perform a general optimization of the parameters in the Multilevel Monte Carlo (MLMC) discretization hierarchy based on uniform discretization methods with general approximation orders and computational costs. We optimize hierarchies with geometric and non-geometric sequences of mesh sizes and show that geometric hierarchies, when optimized, are nearly optimal and have the same asymptotic computational complexity as non-geometric optimal hierarchies. We discuss how enforcing constraints on parameters of MLMC hierarchies affects the optimality of these hierarchies. These constraints include an upper and a lower bound on the mesh size or enforcing that the number of samples and the number of discretization elements are integers. We also discuss the optimal tolerance splitting between the bias and the statistical error contributions and its asymptotic behavior. To provide numerical grounds for our theoretical results, we apply these optimized hierarchies together with the Continuation MLMC Algorithm. The first example considers a three-dimensional elliptic partial differential equation with random inputs. Its space discretization is based on continuous piecewise trilinear finite elements and the corresponding linear system is solved by either a direct or an iterative solver. The second example considers a one-dimensional Itô stochastic differential equation discretized by a Milstein scheme.
NAMar 25, 2015
Multi-Index Monte Carlo: When Sparsity Meets SamplingAbdul-Lateef Haji-Ali, Fabio Nobile, Raul Tempone
We propose and analyze a novel Multi-Index Monte Carlo (MIMC) method for weak approximation of stochastic models that are described in terms of differential equations either driven by random measures or with random coefficients. The MIMC method is both a stochastic version of the combination technique introduced by Zenger, Griebel and collaborators and an extension of the Multilevel Monte Carlo (MLMC) method first described by Heinrich and Giles. Inspired by Giles's seminal work, we use in MIMC high-order mixed differences instead of using first-order differences as in MLMC to reduce the variance of the hierarchical differences dramatically. This in turn yields new and improved complexity results, which are natural generalizations of Giles's MLMC analysis and which increase the domain of the problem parameters for which we achieve the optimal convergence, $\mathcal{O}(\text{TOL}^{-2}).$ Moreover, in MIMC, the rate of increase of required memory with respect to $\text{TOL}$ is independent of the number of directions up to a logarithmic term which allows far more accurate solutions to be calculated for higher dimensions than what is possible when using MLMC. We motivate the setting of MIMC by first focusing on a simple full tensor index set. We then propose a systematic construction of optimal sets of indices for MIMC based on properly defined profits that in turn depend on the average cost per sample and the corresponding weak error and variance. Under standard assumptions on the convergence rates of the weak error, variance and work per sample, the optimal index set turns out to be the total degree (TD) type. In some cases, using optimal index sets, MIMC achieves a better rate for the computational complexity than the corresponding rate when using full tensor index sets...