Michael D. Shields

h-index: 46

26 papers

654 citations

Novelty: 51%

AI Score: 58

Ranked #4,750 of 194,257 authors (top 2%)

#1,305 in LG (top 3%)

26 Papers

29.5 | LG | Apr 20, 2022 | Code

Deep transfer operator learning for partial differential equations under conditional shift

Somdatta Goswami, Katiana Kontolati, Michael D. Shields et al.

Transfer learning (TL) enables the transfer of knowledge gained in learning to perform one task (source) to a related but different task (target), hence addressing the expense of data acquisition and labeling, potential computational power limitations, and dataset distribution mismatches. We propose a new TL framework for task-specific learning (functional regression in partial differential equations (PDEs)) under conditional shift based on the deep operator network (DeepONet). Task-specific operator learning is accomplished by fine-tuning task-specific layers of the target DeepONet using a hybrid loss function that allows for the matching of individual target samples while also preserving the global properties of the conditional distribution of target data. Inspired by the conditional embedding operator theory, we minimize the statistical distance between labeled target data and the surrogate prediction on unlabeled target data by embedding conditional distributions onto a reproducing kernel Hilbert space. We demonstrate the advantages of our approach for various TL scenarios involving nonlinear PDEs under diverse conditions due to shift in the geometric domain and model dynamics. Our TL framework enables fast and efficient learning of heterogeneous tasks despite significant differences between the source and target domains.

20.2 | LG | Mar 9, 2022 | Code

On the influence of over-parameterization in manifold based surrogates and deep neural operators

Katiana Kontolati, Somdatta Goswami, Michael D. Shields et al.

Constructing accurate and generalizable approximators for complex physico-chemical processes exhibiting highly non-smooth dynamics is challenging. In this work, we propose new developments and perform comparisons for two promising approaches: manifold-based polynomial chaos expansion (m-PCE) and the deep neural operator (DeepONet), and we examine the effect of over-parameterization on generalization. We demonstrate the performance of these methods in terms of generalization accuracy by solving the 2D time-dependent Brusselator reaction-diffusion system with uncertainty sources, modeling an autocatalytic chemical reaction between two species. We first propose an extension of the m-PCE by constructing a mapping between latent spaces formed by two separate embeddings of input functions and output QoIs. To enhance the accuracy of the DeepONet, we introduce weight self-adaptivity in the loss function. We demonstrate that the performance of m-PCE and DeepONet is comparable for cases of relatively smooth input-output mappings. However, when highly non-smooth dynamics is considered, DeepONet shows higher accuracy. We also find that for m-PCE, modest over-parameterization leads to better generalization, both within and outside of distribution, whereas aggressive over-parameterization leads to over-fitting. In contrast, an even highly over-parameterized DeepONet leads to better generalization for both smooth and non-smooth dynamics. Furthermore, we compare the performance of the above models with another operator learning model, the Fourier Neural Operator, and show that its over-parameterization also leads to better generalization. Our studies show that m-PCE can provide very good accuracy at very low training cost, whereas a highly over-parameterized DeepONet can provide better accuracy and robustness to noise but at higher training cost. In both methods, the inference cost is negligible.

18.8 | LG | Aug 5, 2024 | Code

Synergistic Learning with Multi-Task DeepONet for Efficient PDE Problem Solving

Varun Kumar, Somdatta Goswami, Katiana Kontolati et al.

Multi-task learning (MTL) is an inductive transfer mechanism designed to leverage useful information from multiple tasks to improve generalization performance compared to single-task learning. It has been extensively explored in traditional machine learning to address issues such as data sparsity and overfitting in neural networks. In this work, we apply MTL to problems in science and engineering governed by partial differential equations (PDEs). However, implementing MTL in this context is complex, as it requires task-specific modifications to accommodate various scenarios representing different physical processes. To this end, we present a multi-task deep operator network (MT-DeepONet) to learn solutions across various functional forms of source terms in a PDE and multiple geometries in a single concurrent training session. We introduce modifications in the branch network of the vanilla DeepONet to account for various functional forms of a parameterized coefficient in a PDE. Additionally, we handle parameterized geometries by introducing a binary mask in the branch network and incorporating it into the loss term to improve convergence and generalization to new geometry tasks. Our approach is demonstrated on three benchmark problems: (1) learning different functional forms of the source term in the Fisher equation; (2) learning multiple geometries in a 2D Darcy Flow problem and showcasing better transfer learning capabilities to new geometries; and (3) learning 3D parameterized geometries for a heat transfer problem and demonstrating the ability to predict on new but similar geometries. Our MT-DeepONet framework offers a novel approach to solving PDE problems in engineering and science under a unified umbrella based on synergistic learning that reduces the overall training cost for neural operators.

17.5 | LG | Apr 15, 2023 | Code

Learning in latent spaces improves the predictive accuracy of deep neural operators

Katiana Kontolati, Somdatta Goswami, George Em Karniadakis et al.

Operator regression provides a powerful means of constructing discretization-invariant emulators for partial-differential equations (PDEs) describing physical systems. Neural operators specifically employ deep neural networks to approximate mappings between infinite-dimensional Banach spaces. As data-driven models, neural operators require the generation of labeled observations, which in cases of complex high-fidelity models result in high-dimensional datasets containing redundant and noisy features, which can hinder gradient-based optimization. Mapping these high-dimensional datasets to a low-dimensional latent space of salient features can make it easier to work with the data and also enhance learning. In this work, we investigate the latent deep operator network (L-DeepONet), an extension of standard DeepONet, which leverages latent representations of high-dimensional PDE input and output functions identified with suitable autoencoders. We illustrate that L-DeepONet outperforms the standard approach in terms of both accuracy and computational efficiency across diverse time-dependent PDEs, e.g., modeling the growth of fracture in brittle materials, convective fluid flows, and large-scale atmospheric flows exhibiting multiscale dynamical features.

6.6 | LG | Aug 27, 2023

On Active Learning for Gaussian Process-based Global Sensitivity Analysis

Mohit Chauhan, Mariel Ojeda-Tuz, Ryan Catarelli et al.

This paper explores the application of active learning strategies to adaptively learn Sobol indices for global sensitivity analysis. We demonstrate that active learning for Sobol indices poses unique challenges due to the definition of the Sobol index as a ratio of variances estimated from Gaussian process surrogates. Consequently, learning strategies must either focus on convergence in the numerator or the denominator of this ratio. However, rapid convergence in either one does not guarantee convergence in the Sobol index. We propose a novel strategy for active learning that focuses on resolving the main effects of the Gaussian process (associated with the numerator of the Sobol index) and compare this with existing strategies based on convergence in the total variance (the denominator of the Sobol index). The new strategy, implemented through a new learning function termed the MUSIC (minimize uncertainty in Sobol index convergence), generally converges in Sobol index error more rapidly than the existing strategies based on the Expected Improvement for Global Fit (EIGF) and the Variance Improvement for Global Fit (VIGF). Both strategies are compared with simple sequential random sampling and the MUSIC learning function generally converges most rapidly for low-dimensional problems. However, for high-dimensional problems, the performance is comparable to random sampling. The new learning strategy is demonstrated for a practical case of adaptive experimental design for large-scale Boundary Layer Wind Tunnel experiments.
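
The ratio-of-variances definition in this abstract can be made concrete with a small sketch. For a model Y = f(X_1, ..., X_d) with independent inputs, the first-order Sobol index is S_i = Var(E[Y | X_i]) / Var(Y). The code below estimates it with a plain Monte Carlo pick-freeze estimator, not with Gaussian process surrogates or the MUSIC learning function described in the paper. The function name `first_order_sobol`, the uniform inputs on [-1, 1], and the linear test model are all assumptions made for illustration.

```python
import numpy as np

def first_order_sobol(f, d, n, rng):
    """Pick-freeze Monte Carlo estimate of first-order Sobol indices.

    Inputs are assumed independent and uniform on [-1, 1] (illustrative choice).
    """
    A = rng.uniform(-1.0, 1.0, size=(n, d))
    B = rng.uniform(-1.0, 1.0, size=(n, d))
    fA, fB = f(A), f(B)
    total_var = np.var(np.concatenate([fA, fB]))  # denominator: Var(Y)
    S = np.empty(d)
    for i in range(d):
        AB = A.copy()
        AB[:, i] = B[:, i]  # A with column i taken from B
        # numerator: estimate of Var(E[Y | X_i])
        S[i] = np.mean(fB * (f(AB) - fA)) / total_var
    return S

# Hypothetical test model Y = X_1 + 2 X_2, whose exact indices are 0.2 and 0.8.
rng = np.random.default_rng(0)
S = first_order_sobol(lambda X: X[:, 0] + 2.0 * X[:, 1], d=2, n=100_000, rng=rng)
print(S)
```

Because the numerator and denominator are estimated separately, an error in either one propagates into the index, which is the difficulty the abstract points to.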

9.2 | LG | Jul 17, 2024

A Resolution Independent Neural Operator

Bahador Bahmani, Somdatta Goswami, Ioannis G. Kevrekidis et al.

The Deep Operator Network (DeepONet) is a powerful neural operator architecture that uses two neural networks to map between infinite-dimensional function spaces. This architecture allows for the evaluation of the solution field at any location within the domain but requires input functions to be discretized at identical locations, limiting practical applications. We introduce a general framework for operator learning from input-output data with arbitrary sensor locations and counts. This begins by introducing a resolution-independent DeepONet (RI-DeepONet), which handles input functions discretized arbitrarily but sufficiently finely. To achieve this, we propose two dictionary learning algorithms that adaptively learn continuous basis functions, parameterized as implicit neural representations (INRs), from correlated signals on arbitrary point clouds. These basis functions project input function data onto a finite-dimensional embedding space, making it compatible with DeepONet without architectural changes. We specifically use sinusoidal representation networks (SIRENs) as trainable INR basis functions. Similarly, the dictionary learning algorithms identify basis functions for output data, defining a new neural operator architecture: the Resolution Independent Neural Operator (RINO). In RINO, the operator learning task reduces to mapping coefficients of input basis functions to output basis functions. We demonstrate RINO's robustness and applicability in handling arbitrarily sampled input and output functions during both training and inference through several numerical examples.

7.1 | CE | Jun 4

Physics-constrained Gaussian Processes for Predicting Shockwave Hugoniot Curves

George D. Pasparakis, Himanshu Sharma, Rushik Desai et al.

A physics-constrained Gaussian Process regression framework is developed for predicting shocked material states and their associated uncertainties along the Hugoniot curve using data from a small number of shockwave simulations. The proposed Gaussian process is constrained by the Rankine-Hugoniot jump conditions between the various shocked material states to construct a thermodynamically consistent covariance function. This leads to the formulation of an optimization problem over a small number of interpretable hyperparameters and enables the identification of regime transitions, from a leading elastic wave to trailing plastic and phase transformation waves. Shock Hugoniots are an important measure for understanding material behavior under extreme conditions, including for the development of equations of state and determining material properties such as the Hugoniot Elastic Limit, but they are costly to generate through large-scale molecular dynamics simulations or shock experiments. Under these constraints, the proposed methodology establishes Hugoniot curves from a limited number of molecular dynamics simulations. We consider silicon carbide as a representative material and Molecular Dynamics simulations are performed using a reverse ballistic approach. The framework reproduces the Hugoniot curve with satisfactory accuracy while also quantifying the uncertainty in the predictions using the Gaussian Process posterior. These uncertain Hugoniot predictions can then be used to calibrate equation of state models, estimate material properties, or inform future experimental and/or simulation campaigns.

7.7 | LG | Jan 31, 2023

Active Learning-based Domain Adaptive Localized Polynomial Chaos Expansion

Lukáš Novák, Michael D. Shields, Václav Sadílek et al.

The paper presents a novel methodology to build surrogate models of complicated functions by an active learning-based sequential decomposition of the input random space and construction of localized polynomial chaos expansions, referred to as domain adaptive localized polynomial chaos expansion (DAL-PCE). The approach utilizes sequential decomposition of the input random space into smaller sub-domains approximated by low-order polynomial expansions. This allows approximation of functions with strong nonlinearities, discontinuities, and/or singularities. Decomposition of the input random space and local approximations alleviates the Gibbs phenomenon for these types of problems and confines error to a very small vicinity near the nonlinearity. The global behavior of the surrogate model is therefore significantly better than existing methods as shown in numerical examples. The whole process is driven by an active learning routine that uses the recently proposed $Θ$ criterion to assess local variance contributions. The proposed approach balances both exploitation of the surrogate model and exploration of the input random space and thus leads to efficient and accurate approximation of the original mathematical model. The numerical results show the superiority of the DAL-PCE in comparison to (i) a single global polynomial chaos expansion and (ii) the recently proposed stochastic spectral embedding (SSE) method, which was developed as an accurate surrogate model and is based on a similar domain decomposition process. This method represents a general framework upon which further extensions and refinements can be based, and which can be combined with any technique for non-intrusive polynomial chaos expansion construction.

3.3 | LG | Dec 7, 2022

General multi-fidelity surrogate models: Framework and active learning strategies for efficient rare event simulation

Promit Chakroborty, Somayajulu L. N. Dhulipala, Yifeng Che et al.

Estimating the probability of failure for complex real-world systems using high-fidelity computational models is often prohibitively expensive, especially when the probability is small. Exploiting low-fidelity models can make this process more feasible, but merging information from multiple low-fidelity and high-fidelity models poses several challenges. This paper presents a robust multi-fidelity surrogate modeling strategy in which the multi-fidelity surrogate is assembled using an active learning strategy using an on-the-fly model adequacy assessment set within a subset simulation framework for efficient reliability analysis. The multi-fidelity surrogate is assembled by first applying a Gaussian process correction to each low-fidelity model and assigning a model probability based on the model's local predictive accuracy and cost. Three strategies are proposed to fuse these individual surrogates into an overall surrogate model based on model averaging and deterministic/stochastic model selection. The strategies also dictate which model evaluations are necessary. No assumptions are made about the relationships between low-fidelity models, while the high-fidelity model is assumed to be the most accurate and most computationally expensive model. Through two analytical and two numerical case studies, including a case study evaluating the failure probability of Tristructural isotropic-coated (TRISO) nuclear fuels, the algorithm is shown to be highly accurate while drastically reducing the number of high-fidelity model calls (and hence computational cost).

4.6 | LG | Aug 12, 2022 | Code

Bayesian Inference with Latent Hamiltonian Neural Networks

Somayajulu L. N. Dhulipala, Yifeng Che, Michael D. Shields

When sampling for Bayesian inference, one popular approach is to use Hamiltonian Monte Carlo (HMC) and specifically the No-U-Turn Sampler (NUTS) which automatically decides the end time of the Hamiltonian trajectory. However, HMC and NUTS can require numerous numerical gradients of the target density, and can prove slow in practice. We propose Hamiltonian neural networks (HNNs) with HMC and NUTS for solving Bayesian inference problems. Once trained, HNNs do not require numerical gradients of the target density during sampling. Moreover, they satisfy important properties such as perfect time reversibility and Hamiltonian conservation, making them well-suited for use within HMC and NUTS because stationarity can be shown. We also propose an HNN extension called latent HNNs (L-HNNs), which are capable of predicting latent variable outputs. Compared to HNNs, L-HNNs offer improved expressivity and reduced integration errors. Finally, we employ L-HNNs in NUTS with an online error monitoring scheme to prevent sample degeneracy in regions of low probability density. We demonstrate L-HNNs in NUTS with online error monitoring on several examples involving complex, heavy-tailed, and high-local-curvature probability densities. Overall, L-HNNs in NUTS with online error monitoring satisfactorily inferred these probability densities. Compared to traditional NUTS, L-HNNs in NUTS with online error monitoring required 1 to 2 orders of magnitude fewer numerical gradients of the target density and improved the effective sample size (ESS) per gradient by an order of magnitude.
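
To show where the gradient cost in HMC comes from, here is a minimal sketch of the standard leapfrog integrator that HMC and NUTS use to simulate a Hamiltonian trajectory. Every step calls the gradient of the potential (the negative log target density), and that call is the quantity a trained HNN would stand in for. The standard Gaussian target, the function names, and the step settings are assumptions chosen for illustration; this is not the L-HNN method from the paper.

```python
import numpy as np

def grad_U(q):
    # Gradient of the potential U(q) = 0.5 * q.q, i.e. a standard Gaussian
    # target (assumed here purely for illustration).
    return q

def hamiltonian(q, p):
    # Total energy: potential plus kinetic term with unit mass.
    return 0.5 * q @ q + 0.5 * p @ p

def leapfrog(q, p, step, n_steps):
    """Integrate Hamiltonian dynamics; uses n_steps + 1 gradient evaluations."""
    q, p = q.copy(), p.copy()
    p -= 0.5 * step * grad_U(q)          # initial half step in momentum
    for _ in range(n_steps - 1):
        q += step * p                    # full step in position
        p -= step * grad_U(q)            # full step in momentum
    q += step * p
    p -= 0.5 * step * grad_U(q)          # final half step in momentum
    return q, p

q0 = np.array([1.0, -0.5])
p0 = np.array([0.3, 0.8])
q1, p1 = leapfrog(q0, p0, step=0.05, n_steps=40)

# The Hamiltonian is conserved only approximately by the discretization.
energy_drift = abs(hamiltonian(q1, p1) - hamiltonian(q0, p0))

# Time reversibility: flip the momentum, integrate again, recover the start.
q_back, p_back = leapfrog(q1, -p1, step=0.05, n_steps=40)
```

The two checks at the end correspond to the properties named in the abstract: Hamiltonian conservation (here only approximate, with a small drift) and time reversibility.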

2.1 | ML | Sep 19, 2022 | Code

Physics-Informed Machine Learning of Dynamical Systems for Efficient Bayesian Inference

Somayajulu L. N. Dhulipala, Yifeng Che, Michael D. Shields

Although the no-u-turn sampler (NUTS) is a widely adopted method for performing Bayesian inference, it requires numerous posterior gradients which can be expensive to compute in practice. Recently, there has been a significant interest in physics-based machine learning of dynamical (or Hamiltonian) systems and Hamiltonian neural networks (HNNs) is a noteworthy architecture. But these types of architectures have not been applied to solve Bayesian inference problems efficiently. We propose the use of HNNs for performing Bayesian inference efficiently without requiring numerous posterior gradients. We introduce latent variable outputs to HNNs (L-HNNs) for improved expressivity and reduced integration errors. We integrate L-HNNs in NUTS and further propose an online error monitoring scheme to prevent sampling degeneracy in regions where L-HNNs may have little training data. We demonstrate L-HNNs in NUTS with online error monitoring considering several complex high-dimensional posterior densities and compare its performance to NUTS.

10.7 | LG | Sep 4, 2023

Physics-Informed Polynomial Chaos Expansions

Lukáš Novák, Himanshu Sharma, Michael D. Shields

Surrogate modeling of costly mathematical models representing physical systems is challenging since it is typically not possible to create a large experimental design. Thus, it is beneficial to constrain the approximation to adhere to the known physics of the model. This paper presents a novel methodology for the construction of physics-informed polynomial chaos expansions (PCE) that combines the conventional experimental design with additional constraints from the physics of the model. Physical constraints investigated in this paper are represented by a set of differential equations and specified boundary conditions. A computationally efficient means for construction of physically constrained PCE is proposed and compared to standard sparse PCE. It is shown that the proposed algorithms lead to superior accuracy of the approximation and do not add significant computational burden. Although the main purpose of the proposed method lies in combining data and physical constraints, we show that physically constrained PCEs can be constructed from differential equations and boundary conditions alone without requiring evaluations of the original model. We further show that the constrained PCEs can be easily applied for uncertainty quantification through analytical post-processing of a reduced PCE filtering out the influence of all deterministic space-time variables. Several deterministic examples of increasing complexity are provided and the proposed method is applied for uncertainty quantification.

4.5 | ML | Dec 11, 2025

Physics-informed Polynomial Chaos Expansion with Enhanced Constrained Optimization Solver and D-optimal Sampling

Qitian Lu, Himanshu Sharma, Michael D. Shields et al.

Physics-informed polynomial chaos expansions (PC$^2$) provide an efficient physically constrained surrogate modeling framework by embedding governing equations and other physical constraints into the standard data-driven polynomial chaos expansions (PCE) and solving via the Karush-Kuhn-Tucker (KKT) conditions. This approach improves the physical interpretability of surrogate models while achieving high computational efficiency and accuracy. However, the performance and efficiency of PC$^2$ can still be degraded with high-dimensional parameter spaces, limited data availability, or unrepresentative training data. To address this problem, this study explores two complementary enhancements to the PC$^2$ framework. First, a numerically efficient constrained optimization solver, straightforward updating of Lagrange multipliers (SULM), is adopted as an alternative to the conventional KKT solver. The SULM method significantly reduces computational cost when solving physically constrained problems with high-dimensionality and derivative boundary conditions that require a large number of virtual points. Second, a D-optimal sampling strategy is utilized to select informative virtual points to improve the stability and achieve the balance of accuracy and efficiency of the PC$^2$. The proposed methods are integrated into the PC$^2$ framework and evaluated through numerical examples of representative physical systems governed by ordinary or partial differential equations. The results demonstrate that the enhanced PC$^2$ has better comprehensive capability than standard PC$^2$, and is well-suited for high-dimensional uncertainty quantification tasks.
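
The idea of solving a constrained fit through the KKT conditions can be illustrated in its simplest form: a least-squares polynomial fit with linear equality constraints, where the coefficients and Lagrange multipliers come from one linear system. The sketch below is an assumption-laden toy, not the PC$^2$ implementation or the SULM solver. It uses a 1D Legendre basis, treats the boundary conditions u(-1) = u(1) = 0 as the only "physics", and uses the hypothetical target u(x) = 1 - x^2 with small added noise. The function name `constrained_pce_fit` is made up for this example.

```python
import numpy as np
from numpy.polynomial import legendre as L

def constrained_pce_fit(x_data, y_data, x_con, y_con, degree):
    """Minimize ||A c - y||^2 subject to B c = d by solving the KKT system."""
    A = L.legvander(x_data, degree)   # basis evaluated at data points
    B = L.legvander(x_con, degree)    # basis evaluated at constraint points
    n, m = A.shape[1], B.shape[0]
    # Stationarity: 2 A^T A c + B^T lam = 2 A^T y; feasibility: B c = d.
    K = np.block([[2.0 * A.T @ A, B.T],
                  [B, np.zeros((m, m))]])
    rhs = np.concatenate([2.0 * A.T @ y_data, y_con])
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n:]           # coefficients, Lagrange multipliers

rng = np.random.default_rng(0)
x_data = np.linspace(-0.8, 0.8, 7)                       # interior data only
y_data = 1.0 - x_data**2 + 0.01 * rng.standard_normal(7)  # noisy observations
x_con = np.array([-1.0, 1.0])                            # boundary points
y_con = np.zeros(2)                                      # u(-1) = u(1) = 0

coeffs, multipliers = constrained_pce_fit(x_data, y_data, x_con, y_con, degree=4)
boundary_values = L.legval(x_con, coeffs)
```

Even though no data point lies on the boundary and the observations are noisy, `boundary_values` satisfies the constraints to solver precision. The KKT matrix grows with the number of constraint points, which is consistent with the cost concern the abstract raises for problems needing many virtual points.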

18.3 | ML | Feb 23, 2024

Physics-constrained polynomial chaos expansion for scientific machine learning and uncertainty quantification

Himanshu Sharma, Lukáš Novák, Michael D. Shields

We present a novel physics-constrained polynomial chaos expansion as a surrogate modeling method capable of performing both scientific machine learning (SciML) and uncertainty quantification (UQ) tasks. The proposed method possesses a unique capability: it seamlessly integrates SciML into UQ and vice versa, which allows it to quantify the uncertainties in SciML tasks effectively and leverage SciML for improved uncertainty assessment during UQ-related tasks. The proposed surrogate model can effectively incorporate a variety of physical constraints, such as governing partial differential equations (PDEs) with associated initial and boundary conditions constraints, inequality-type constraints (e.g., monotonicity, convexity, non-negativity, among others), and additional a priori information in the training process to supplement limited data. This ensures physically realistic predictions and significantly reduces the need for expensive computational model evaluations to train the surrogate model. Furthermore, the proposed method has a built-in uncertainty quantification (UQ) feature to efficiently estimate output uncertainties. To demonstrate the effectiveness of the proposed method, we apply it to a diverse set of problems, including linear/non-linear PDEs with deterministic and stochastic parameters, data-driven surrogate modeling of a complex physical system, and UQ of a stochastic system with parameters modeled as random fields.

3.1 | ML | Jan 10, 2024

Reliability Analysis of Complex Systems using Subset Simulations with Hamiltonian Neural Networks

Denny Thaler, Somayajulu L. N. Dhulipala, Franz Bamer et al.

We present a new Subset Simulation approach using Hamiltonian neural network-based Monte Carlo sampling for reliability analysis. The proposed strategy combines the superior sampling of the Hamiltonian Monte Carlo method with computationally efficient gradient evaluations using Hamiltonian neural networks. This combination is especially advantageous because the neural network architecture conserves the Hamiltonian, which defines the acceptance criteria of the Hamiltonian Monte Carlo sampler. Hence, this strategy achieves high acceptance rates at low computational cost. Our approach estimates small failure probabilities using Subset Simulations. However, in low-probability sample regions, the gradient evaluation is particularly challenging. The remarkable accuracy of the proposed strategy is demonstrated on different reliability problems, and its efficiency is compared to the traditional Hamiltonian Monte Carlo method. We note that this approach can reach its limitations for gradient estimations in low-probability regions of complex and high-dimensional distributions. Thus, we propose techniques to improve gradient prediction in these particular situations and enable accurate estimations of the probability of failure. The highlight of this study is the reliability analysis of a system whose parameter distributions must be inferred through Bayesian inference. In such a case, the Hamiltonian Monte Carlo method requires a full model evaluation for each gradient evaluation and, therefore, comes at a very high cost. However, using Hamiltonian neural networks in this framework replaces the expensive model evaluation, resulting in tremendous improvements in computational efficiency.

12.0 | ML | Jan 30, 2024

Polynomial Chaos Expansions on Principal Geodesic Grassmannian Submanifolds for Surrogate Modeling and Uncertainty Quantification

Dimitris G. Giovanis, Dimitrios Loukrezis, Ioannis G. Kevrekidis et al.

In this work we introduce a manifold learning-based surrogate modeling framework for uncertainty quantification in high-dimensional stochastic systems. Our first goal is to perform data mining on the available simulation data to identify a set of low-dimensional (latent) descriptors that efficiently parameterize the response of the high-dimensional computational model. To this end, we employ Principal Geodesic Analysis on the Grassmann manifold of the response to identify a set of disjoint principal geodesic submanifolds, of possibly different dimension, that captures the variation in the data. Since operations on the Grassmann manifold require the data to be concentrated, we propose an adaptive algorithm based on Riemannian K-means and the minimization of the sample Fréchet variance on the Grassmann manifold to identify "local" principal geodesic submanifolds that represent different system behavior across the parameter space. Polynomial chaos expansion is then used to construct a mapping between the random input parameters and the projection of the response on these local principal geodesic submanifolds. The method is demonstrated on four test cases: a toy example that involves points on a hypersphere, a Lotka-Volterra dynamical system, a continuous-flow stirred-tank chemical reactor system, and a two-dimensional Rayleigh-Bénard convection problem.

15.7 | LG | Feb 16, 2025

Neural Operators for Stochastic Modeling of Nonlinear Structural System Response to Natural Hazards

Somdatta Goswami, Dimitris G. Giovanis, Bowei Li et al.

Traditionally, neural networks have been employed to learn the mapping between finite-dimensional Euclidean spaces. However, recent research has opened up new horizons, focusing on the utilization of deep neural networks to learn operators capable of mapping infinite-dimensional function spaces. In this work, we employ two state-of-the-art neural operators, the deep operator network (DeepONet) and the Fourier neural operator (FNO) for the prediction of the nonlinear time history response of structural systems exposed to natural hazards, such as earthquakes and wind. Specifically, we propose two architectures, a self-adaptive FNO and a Fast Fourier Transform-based DeepONet (DeepFNOnet), where we employ a FNO beyond the DeepONet to learn the discrepancy between the ground truth and the solution predicted by the DeepONet. To demonstrate the efficiency and applicability of the architectures, two problems are considered. In the first, we use the proposed model to predict the seismic nonlinear dynamic response of a six-story shear building subject to stochastic ground motions. In the second problem, we employ the operators to predict the wind-induced nonlinear dynamic response of a high-rise building while explicitly accounting for the stochastic nature of the wind excitation. In both cases, the trained metamodels achieve high accuracy while being orders of magnitude faster than their corresponding high-fidelity models.

12.3 | ML | Aug 28, 2025

Polynomial Chaos Expansion for Operator Learning

Himanshu Sharma, Lukáš Novák, Michael D. Shields

Operator learning (OL) has emerged as a powerful tool in scientific machine learning (SciML) for approximating mappings between infinite-dimensional functional spaces. One of its main applications is learning the solution operator of partial differential equations (PDEs). While much of the progress in this area has been driven by deep neural network-based approaches such as Deep Operator Networks (DeepONet) and Fourier Neural Operator (FNO), recent work has begun to explore traditional machine learning methods for OL. In this work, we introduce polynomial chaos expansion (PCE) as an OL method. PCE has been widely used for uncertainty quantification (UQ) and has recently gained attention in the context of SciML. For OL, we establish a mathematical framework that enables PCE to approximate operators in both purely data-driven and physics-informed settings. The proposed framework reduces the task of learning the operator to solving a system of equations for the PCE coefficients. Moreover, the framework provides UQ by simply post-processing the PCE coefficients, without any additional computational cost. We apply the proposed method to a diverse set of PDE problems to demonstrate its capabilities. Numerical results demonstrate the strong performance of the proposed method in both OL and UQ tasks, achieving excellent numerical accuracy and computational efficiency.
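
The claim that UQ follows from post-processing the PCE coefficients rests on the orthogonality of the basis: the mean is the zeroth coefficient, and the variance is a weighted sum of the squared remaining coefficients. The sketch below shows this for an ordinary scalar PCE, not for the operator-learning framework of the paper. The Legendre basis, the uniform input on [-1, 1], and the hypothetical model y = x^2 are assumptions chosen so the exact answers (mean 1/3, variance 4/45) are known.

```python
import numpy as np
from numpy.polynomial import legendre as L

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=50)   # samples of the random input
y = x**2                              # hypothetical model evaluations

degree = 4
Psi = L.legvander(x, degree)          # Legendre basis at the sample points
coeffs, *_ = np.linalg.lstsq(Psi, y, rcond=None)  # solve for PCE coefficients

# E[P_k(X)^2] = 1 / (2k + 1) for X uniform on [-1, 1].
norms = 1.0 / (2.0 * np.arange(degree + 1) + 1.0)

pce_mean = coeffs[0]                              # mean is the zeroth coefficient
pce_var = np.sum(coeffs[1:] ** 2 * norms[1:])     # variance from the rest
```

No further model evaluations or sampling are needed once the coefficients are known, which is the sense in which the moments come without additional cost.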

10.3 | ML | Jul 19, 2025

Accelerating Hamiltonian Monte Carlo for Bayesian Inference in Neural Networks and Neural Operators

Ponkrshnan Thiagarajan, Tamer A. Zaki, Michael D. Shields

Hamiltonian Monte Carlo (HMC) is a powerful and accurate method to sample from the posterior distribution in Bayesian inference. However, HMC techniques are computationally demanding for Bayesian neural networks due to the high dimensionality of the network's parameter space and the non-convexity of their posterior distributions. Therefore, various approximation techniques, such as variational inference (VI) or stochastic gradient MCMC, are often employed to infer the posterior distribution of the network parameters. Such approximations introduce inaccuracies in the inferred distributions, resulting in unreliable uncertainty estimates. In this work, we propose a hybrid approach that combines inexpensive VI and accurate HMC methods to efficiently and accurately quantify uncertainties in neural networks and neural operators. The proposed approach leverages an initial VI training on the full network. We examine the influence of individual parameters on the prediction uncertainty, which shows that a large proportion of the parameters do not contribute substantially to uncertainty in the network predictions. This information is then used to significantly reduce the dimension of the parameter space, and HMC is performed only for the subset of network parameters that strongly influence prediction uncertainties. This yields a framework for accelerating the full batch HMC for posterior inference in neural networks. We demonstrate the efficiency and accuracy of the proposed framework on deep neural networks and operator networks, showing that inference can be performed for large networks with tens to hundreds of thousands of parameters. We show that this method can effectively learn surrogates for complex physical systems by modeling the operator that maps from upstream conditions to wall-pressure data on a cone in hypersonic flow.

4.5MLJul 15, 2025

Joint space-time wind field data extrapolation and uncertainty quantification using nonparametric Bayesian dictionary learning

George D. Pasparakis, Ioannis A. Kougioumtzoglou, Michael D. Shields

A methodology is developed, based on nonparametric Bayesian dictionary learning, for joint space-time wind field data extrapolation and estimation of related statistics by relying on limited/incomplete measurements. Specifically, utilizing sparse/incomplete measured data, a time-dependent optimization problem is formulated for determining the expansion coefficients of an associated low-dimensional representation of the stochastic wind field. Compared to an alternative, standard compressive sampling (CS) treatment of the problem, the developed methodology exhibits the following advantages. First, the Bayesian formulation also enables quantification of the uncertainty in the estimates. Second, the requirement in standard CS-based applications for an a priori selection of the expansion basis is circumvented. Instead, the basis is selected adaptively from the acquired data. Overall, the methodology exhibits enhanced extrapolation accuracy, even in cases of high-dimensional data of arbitrary form, and of relatively large extrapolation distances. Thus, it can potentially be used in a wide range of wind engineering applications where various constraints dictate the use of a limited number of sensors. The efficacy of the methodology is demonstrated by considering two case studies. The first relates to the extrapolation of simulated wind velocity records consistent with a prescribed joint wavenumber-frequency power spectral density in a three-dimensional domain (2D and time). The second pertains to the extrapolation of four-dimensional (3D and time) boundary layer wind tunnel experimental data that exhibit significant spatial variability and non-Gaussian characteristics.

13.1MLJun 21, 2024

Bayesian neural networks for predicting uncertainty in full-field material response

George D. Pasparakis, Lori Graham-Brady, Michael D. Shields

Stress and material deformation field predictions are among the most important tasks in computational mechanics. These predictions are typically made by solving the governing equations of continuum mechanics using finite element analysis (FEA), which can become computationally prohibitive for complex microstructures and material behaviors. Machine learning (ML) methods offer potentially cost-effective surrogates for these applications. However, existing ML surrogates are limited to low-dimensional problems and/or do not provide uncertainty estimates in the predictions. This work proposes an ML surrogate framework for stress field prediction and uncertainty quantification for diverse material microstructures. A modified Bayesian U-net architecture is employed to provide a data-driven image-to-image mapping from initial microstructure to stress field with prediction (epistemic) uncertainty estimates. The Bayesian posterior distributions for the U-net parameters are estimated using three state-of-the-art inference algorithms: the posterior sampling-based Hamiltonian Monte Carlo method and two variational approaches, the Monte Carlo Dropout method and the Bayes by Backprop algorithm. A systematic comparison of the predictive accuracy and uncertainty estimates for these methods is performed for a fiber-reinforced composite material and a polycrystalline microstructure application. It is shown that the proposed methods yield predictions of high accuracy compared to the FEA solution, while uncertainty estimates depend on the inference approach. Generally, the Hamiltonian Monte Carlo and Bayes by Backprop methods provide consistent uncertainty estimates. Uncertainty estimates from Monte Carlo Dropout, on the other hand, are more difficult to interpret and depend strongly on the method's design.

14.1LGFeb 9, 2022

A survey of unsupervised learning methods for high-dimensional uncertainty quantification in black-box-type problems

Katiana Kontolati, Dimitrios Loukrezis, Dimitris G. Giovanis et al.

Constructing surrogate models for uncertainty quantification (UQ) on complex partial differential equations (PDEs) having inherently high-dimensional $\mathcal{O}(10^{\ge 2})$ stochastic inputs (e.g., forcing terms, boundary conditions, initial conditions) poses tremendous challenges. The curse of dimensionality can be addressed with suitable unsupervised learning techniques used as a pre-processing tool to encode inputs onto lower-dimensional subspaces while retaining their structural information and meaningful properties. In this work, we review and investigate thirteen dimension reduction methods, including linear and nonlinear, spectral, blind source separation, convex and non-convex methods, and utilize the resulting embeddings to construct a mapping to quantities of interest via polynomial chaos expansions (PCE). We refer to the general proposed approach as manifold PCE (m-PCE), where manifold corresponds to the latent space resulting from any of the studied dimension reduction methods. To assess the capabilities and limitations of these methods, we conduct numerical tests for three physics-based systems (treated as black boxes) having high-dimensional stochastic inputs of varying complexity, modeled as both Gaussian and non-Gaussian random fields, and examine the effect of the intrinsic dimensionality of the input data. We demonstrate both the advantages and limitations of the unsupervised learning methods, and we conclude that a suitable m-PCE model provides a cost-effective approach compared to alternative algorithms proposed in the literature, including recently proposed expensive deep neural network-based surrogates, and can be readily applied for high-dimensional UQ in stochastic PDEs.
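
The two-step structure described above (reduce the inputs, then fit a PCE on the latent coordinates) can be sketched in a few lines of NumPy. This is an illustrative sketch only: plain PCA stands in for whichever dimension reduction method is chosen, the basis is a total-degree probabilists' Hermite basis fitted by ordinary least squares, and the function names `fit_mpce` and `predict_mpce` are hypothetical. Hermite polynomials are orthogonal only for Gaussian latent coordinates; for other distributions this is simply a polynomial regression in a convenient basis.

```python
import itertools
import numpy as np
from numpy.polynomial.hermite_e import hermevander

def _design(z, multi_indices, degree):
    """Multivariate Hermite design matrix for latent coordinates z (n x d)."""
    # vander[j][:, k] holds He_k evaluated at the j-th latent coordinate
    vander = [hermevander(z[:, j], degree) for j in range(z.shape[1])]
    cols = [np.prod([vander[j][:, k] for j, k in enumerate(alpha)], axis=0)
            for alpha in multi_indices]
    return np.column_stack(cols)

def fit_mpce(X, y, d, degree):
    """Step 1: PCA of inputs X (n x D) to d dims. Step 2: PCE from latent to y."""
    mean = X.mean(axis=0)
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    components = Vt[:d]
    scale = s[:d] / np.sqrt(len(X) - 1)       # std. dev. of each PCA score
    z = (X - mean) @ components.T / scale     # standardized latent coordinates
    # All multi-indices with total polynomial degree <= degree
    multi_indices = [a for a in itertools.product(range(degree + 1), repeat=d)
                     if sum(a) <= degree]
    coef, *_ = np.linalg.lstsq(_design(z, multi_indices, degree), y, rcond=None)
    return {"mean": mean, "components": components, "scale": scale,
            "multi_indices": multi_indices, "degree": degree, "coef": coef}

def predict_mpce(model, X):
    """Project new inputs with the stored PCA map, then evaluate the PCE."""
    z = (X - model["mean"]) @ model["components"].T / model["scale"]
    return _design(z, model["multi_indices"], model["degree"]) @ model["coef"]
```

Swapping the PCA step for a nonlinear embedding would change only how `z` is computed in `fit_mpce` and `predict_mpce`, provided the embedding can be evaluated on new inputs.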

3.3APJan 6, 2022

Reliability Estimation of an Advanced Nuclear Fuel using Coupled Active Learning, Multifidelity Modeling, and Subset Simulation

Somayajulu L. N. Dhulipala, Michael D. Shields, Promit Chakroborty et al.

Tristructural isotropic (TRISO)-coated particle fuel is a robust nuclear fuel and determining its reliability is critical for the success of advanced nuclear technologies. However, TRISO failure probabilities are small and the associated computational models are expensive. We used coupled active learning, multifidelity modeling, and subset simulation to estimate the failure probabilities of TRISO fuels using several 1D and 2D models. With multifidelity modeling, we replaced expensive high-fidelity (HF) model evaluations with information fusion from two low-fidelity (LF) models. For the 1D TRISO models, we considered three multifidelity modeling strategies: only Kriging, Kriging LF prediction plus Kriging correction, and deep neural network (DNN) LF prediction plus Kriging correction. While the results across these multifidelity modeling strategies compared satisfactorily, strategies employing information fusion from two LF models consistently called the HF model least often. Next, for the 2D TRISO model, we considered two multifidelity modeling strategies: DNN LF prediction plus Kriging correction (data-driven) and 1D TRISO LF prediction plus Kriging correction (physics-based). The physics-based strategy, as expected, consistently required the fewest calls to the HF model. However, the data-driven strategy had a lower overall simulation time since the DNN predictions are instantaneous, and the 1D TRISO model requires a non-negligible simulation time.

3.3BIO-PHOct 29, 2021

Data-driven Uncertainty Quantification in Computational Human Head Models

Kshitiz Upadhyay, Dimitris G. Giovanis, Ahmed Alshareef et al.

Computational models of the human head are promising tools for estimating the impact-induced response of the brain, and thus play an important role in the prediction of traumatic brain injury. Modern biofidelic head model simulations are associated with very high computational cost and high-dimensional inputs and outputs, which limit the applicability of traditional uncertainty quantification (UQ) methods to these systems. In this study, a two-stage, data-driven manifold learning-based framework is proposed for UQ of computational head models. This framework is demonstrated on a 2D subject-specific head model, where the goal is to quantify uncertainty in the simulated strain fields (i.e., output), given variability in the material properties of different brain substructures (i.e., input). In the first stage, a data-driven method based on multi-dimensional Gaussian kernel-density estimation and diffusion maps is used to generate realizations of the input random vector directly from the available data. Computational simulations of a small number of realizations provide input-output pairs for training data-driven surrogate models in the second stage. The surrogate models employ nonlinear dimensionality reduction using Grassmannian diffusion maps, Gaussian process regression to create a low-cost mapping between the input random vector and the reduced solution space, and geometric harmonics models for mapping between the reduced space and the Grassmann manifold. It is demonstrated that the surrogate models provide highly accurate approximations of the computational model while significantly reducing the computational cost. Monte Carlo simulations of the surrogate models are used for uncertainty propagation. UQ of strain fields highlights significant spatial variation in model uncertainty, and reveals key differences in uncertainty among commonly used strain-based brain injury predictor variables.

10.3DATA-ANJul 21, 2021Code

Manifold learning-based polynomial chaos expansions for high-dimensional surrogate models

Katiana Kontolati, Dimitrios Loukrezis, Ketson R. M. dos Santos et al.

In this work we introduce a manifold learning-based method for uncertainty quantification (UQ) in systems describing complex spatiotemporal processes. Our first objective is to identify the embedding of a set of high-dimensional data representing quantities of interest of the computational or analytical model. For this purpose, we employ Grassmannian diffusion maps, a two-step nonlinear dimension reduction technique which allows us to reduce the dimensionality of the data and identify meaningful geometric descriptions in a parsimonious and inexpensive manner. Polynomial chaos expansion is then used to construct a mapping between the stochastic input parameters and the diffusion coordinates of the reduced space. An adaptive clustering technique is proposed to identify an optimal number of clusters of points in the latent space. The similarity of points allows us to construct a number of geometric harmonic emulators which are finally utilized as a set of inexpensive pre-trained models to perform an inverse map of realizations of latent features to the ambient space and thus perform accurate out-of-sample predictions. Thus, the proposed method acts as an encoder-decoder system which is able to automatically handle very high-dimensional data while simultaneously operating successfully in the small-data regime. The method is demonstrated on two benchmark problems and on a system of advection-diffusion-reaction equations which model a first-order chemical reaction between two species. In all test cases, the proposed method is able to achieve highly accurate approximations which ultimately lead to the significant acceleration of UQ tasks.

4.3APNov 23, 2020

Probabilistic modeling of discrete structural response with application to composite plate penetration models

Anindya Bhaduri, Christopher S. Meyer, John W. Gillespie et al.

Discrete response of structures is often a key probabilistic quantity of interest. For example, one may need to identify the probability of a binary event, such as whether or not a structure has buckled. In this study, an adaptive domain-based decomposition and classification method, combined with sparse grid sampling, is used to develop an efficient classification surrogate modeling algorithm for such discrete outputs. An assumption of monotonic behaviour of the output with respect to all model parameters, based on the physics of the problem, helps to reduce the number of model evaluations and makes the algorithm more efficient. As an application problem, this paper deals with the development of a computational framework for generation of the probabilistic penetration response of S-2 glass/SC-15 epoxy composite plates under ballistic impact. This enables the computationally feasible generation of the probabilistic velocity response (PVR) curve, or the $V_0-V_{100}$ curve, as a function of the impact velocity, and the ballistic limit velocity prediction as a function of the model parameters. The PVR curve incorporates the variability of the model input parameters and describes the probability of penetration of the plate as a function of impact velocity.
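
As a rough illustration of what a PVR curve represents, the sketch below builds an empirical one from Monte Carlo samples. It rests on assumptions not stated in the abstract: each sample of the model parameters is taken to yield a single ballistic limit velocity, and the plate is taken to be penetrated whenever the impact velocity is at or above that limit. The function name `pvr_curve` is hypothetical, and the classification surrogate from the paper is not reproduced.

```python
import numpy as np

def pvr_curve(ballistic_limits, velocities):
    """Empirical probability of penetration at each impact velocity.

    ballistic_limits: one ballistic limit velocity per sampled parameter set
                      (assumed to come from a model or surrogate).
    velocities:       impact velocities at which to evaluate the curve.
    """
    limits = np.sort(np.asarray(ballistic_limits, dtype=float))
    v = np.asarray(velocities, dtype=float)
    # Count samples whose ballistic limit is <= v, then normalize.
    return np.searchsorted(limits, v, side="right") / limits.size
```

Under these assumptions, $V_0$ corresponds to the largest velocity at which the curve is still 0 and $V_{100}$ to the smallest velocity at which it reaches 1. For limits of 100, 110, 120, and 130 (arbitrary units), the curve evaluated at 90, 100, 115, 130, and 140 gives 0, 0.25, 0.5, 1, and 1.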