CEJun 4
Physics-constrained Gaussian Processes for Predicting Shockwave Hugoniot CurvesGeorge D. Pasparakis, Himanshu Sharma, Rushik Desai et al.
A physics-constrained Gaussian Process regression framework is developed for predicting shocked material states and their associated uncertainties along the Hugoniot curve using data from a small number of shockwave simulations. The proposed Gaussian process is constrained by the Rankine-Hugoniot jump conditions between the various shocked material states to construct a thermodynamically consistent covariance function. This leads to the formulation of an optimization problem over a small number of interpretable hyperparameters and enables the identification of regime transitions, from a leading elastic wave to trailing plastic and phase transformation waves. Shock Hugoniots are an important measure for understanding material behavior under extreme conditions, including for the development of equations of state and determining material properties such as the Hugoniot Elastic Limit, but they are costly to generate through large-scale molecular dynamics simulations or shock experiments. Under these constraints, the proposed methodology establishes Hugoniot curves from a limited number of molecular dynamics simulations. We consider silicon carbide as a representative material and Molecular Dynamics simulations are performed using a reverse ballistic approach. The framework reproduces the Hugoniot curve with satisfactory accuracy while also quantifying the uncertainty in the predictions using the Gaussian Process posterior. These uncertain Hugoniot predictions can then be used to calibrate equation of state models, estimate material properties, or inform future experimental and/or simulation campaigns.
LGApr 20, 2022
Deep transfer operator learning for partial differential equations under conditional shiftSomdatta Goswami, Katiana Kontolati, Michael D. Shields et al.
Transfer learning (TL) enables the transfer of knowledge gained in learning to perform one task (source) to a related but different task (target), hence addressing the expense of data acquisition and labeling, potential computational power limitations, and dataset distribution mismatches. We propose a new TL framework for task-specific learning (functional regression in partial differential equations (PDEs)) under conditional shift based on the deep operator network (DeepONet). Task-specific operator learning is accomplished by fine-tuning task-specific layers of the target DeepONet using a hybrid loss function that allows for the matching of individual target samples while also preserving the global properties of the conditional distribution of target data. Inspired by the conditional embedding operator theory, we minimize the statistical distance between labeled target data and the surrogate prediction on unlabeled target data by embedding conditional distributions onto a reproducing kernel Hilbert space. We demonstrate the advantages of our approach for various TL scenarios involving nonlinear PDEs under diverse conditions due to shift in the geometric domain and model dynamics. Our TL framework enables fast and efficient learning of heterogeneous tasks despite significant differences between the source and target domains.
LGMar 9, 2022
On the influence of over-parameterization in manifold based surrogates and deep neural operatorsKatiana Kontolati, Somdatta Goswami, Michael D. Shields et al.
Constructing accurate and generalizable approximators for complex physico-chemical processes exhibiting highly non-smooth dynamics is challenging. In this work, we propose new developments and perform comparisons for two promising approaches: manifold-based polynomial chaos expansion (m-PCE) and the deep neural operator (DeepONet), and we examine the effect of over-parameterization on generalization. We demonstrate the performance of these methods in terms of generalization accuracy by solving the 2D time-dependent Brusselator reaction-diffusion system with uncertainty sources, modeling an autocatalytic chemical reaction between two species. We first propose an extension of the m-PCE by constructing a mapping between latent spaces formed by two separate embeddings of input functions and output QoIs. To enhance the accuracy of the DeepONet, we introduce weight self-adaptivity in the loss function. We demonstrate that the performance of m-PCE and DeepONet is comparable for cases of relatively smooth input-output mappings. However, when highly non-smooth dynamics is considered, DeepONet shows higher accuracy. We also find that for m-PCE, modest over-parameterization leads to better generalization, both within and outside of distribution, whereas aggressive over-parameterization leads to over-fitting. In contrast, an even highly over-parameterized DeepONet leads to better generalization for both smooth and non-smooth dynamics. Furthermore, we compare the performance of the above models with another operator learning model, the Fourier Neural Operator, and show that its over-parameterization also leads to better generalization. Our studies show that m-PCE can provide very good accuracy at very low training cost, whereas a highly over-parameterized DeepONet can provide better accuracy and robustness to noise but at higher training cost. In both methods, the inference cost is negligible.
LGAug 5, 2024
Synergistic Learning with Multi-Task DeepONet for Efficient PDE Problem SolvingVarun Kumar, Somdatta Goswami, Katiana Kontolati et al.
Multi-task learning (MTL) is an inductive transfer mechanism designed to leverage useful information from multiple tasks to improve generalization performance compared to single-task learning. It has been extensively explored in traditional machine learning to address issues such as data sparsity and overfitting in neural networks. In this work, we apply MTL to problems in science and engineering governed by partial differential equations (PDEs). However, implementing MTL in this context is complex, as it requires task-specific modifications to accommodate various scenarios representing different physical processes. To this end, we present a multi-task deep operator network (MT-DeepONet) to learn solutions across various functional forms of source terms in a PDE and multiple geometries in a single concurrent training session. We introduce modifications in the branch network of the vanilla DeepONet to account for various functional forms of a parameterized coefficient in a PDE. Additionally, we handle parameterized geometries by introducing a binary mask in the branch network and incorporating it into the loss term to improve convergence and generalization to new geometry tasks. Our approach is demonstrated on three benchmark problems: (1) learning different functional forms of the source term in the Fisher equation; (2) learning multiple geometries in a 2D Darcy Flow problem and showcasing better transfer learning capabilities to new geometries; and (3) learning 3D parameterized geometries for a heat transfer problem and demonstrate the ability to predict on new but similar geometries. Our MT-DeepONet framework offers a novel approach to solving PDE problems in engineering and science under a unified umbrella based on synergistic learning that reduces the overall training cost for neural operators.
LGApr 15, 2023
Learning in latent spaces improves the predictive accuracy of deep neural operatorsKatiana Kontolati, Somdatta Goswami, George Em Karniadakis et al.
Operator regression provides a powerful means of constructing discretization-invariant emulators for partial-differential equations (PDEs) describing physical systems. Neural operators specifically employ deep neural networks to approximate mappings between infinite-dimensional Banach spaces. As data-driven models, neural operators require the generation of labeled observations, which in cases of complex high-fidelity models result in high-dimensional datasets containing redundant and noisy features, which can hinder gradient-based optimization. Mapping these high-dimensional datasets to a low-dimensional latent space of salient features can make it easier to work with the data and also enhance learning. In this work, we investigate the latent deep operator network (L-DeepONet), an extension of standard DeepONet, which leverages latent representations of high-dimensional PDE input and output functions identified with suitable autoencoders. We illustrate that L-DeepONet outperforms the standard approach in terms of both accuracy and computational efficiency across diverse time-dependent PDEs, e.g., modeling the growth of fracture in brittle materials, convective fluid flows, and large-scale atmospheric flows exhibiting multiscale dynamical features.
NASep 14, 2017
Stochastic collocation approach with adaptive mesh refinement for parametric uncertainty analysisAnindya Bhaduri, Yanyan He, Michael D. Shields et al.
Presence of a high-dimensional stochastic parameter space with discontinuities poses major computational challenges in analyzing and quantifying the effects of the uncertainties in a physical system. In this paper, we propose a stochastic collocation method with adaptive mesh refinement (SCAMR) to deal with high dimensional stochastic systems with discontinuities. Specifically, the proposed approach uses generalized polynomial chaos (gPC) expansion with Legendre polynomial basis and solves for the gPC coefficients using the least squares method. It also implements an adaptive mesh (element) refinement strategy which checks for abrupt variations in the output based on the second order gPC approximation error to track discontinuities or non-smoothness. In addition, the proposed method involves a criterion for checking possible dimensionality reduction and consequently, the decomposition of the full-dimensional problem to a number of lower-dimensional subproblems. Specifically, this criterion checks all the existing interactions between input dimensions of a specific problem based on the high-dimensional model representation (HDMR) method, and therefore automatically provides the subproblems which only involve interacting dimensions. The efficiency of the approach is demonstrated using both smooth and non-smooth function examples with input dimensions up to 300, and the approach is compared against other existing algorithms.
LGAug 27, 2023
On Active Learning for Gaussian Process-based Global Sensitivity AnalysisMohit Chauhan, Mariel Ojeda-Tuz, Ryan Catarelli et al.
This paper explores the application of active learning strategies to adaptively learn Sobol indices for global sensitivity analysis. We demonstrate that active learning for Sobol indices poses unique challenges due to the definition of the Sobol index as a ratio of variances estimated from Gaussian process surrogates. Consequently, learning strategies must either focus on convergence in the numerator or the denominator of this ratio. However, rapid convergence in either one does not guarantee convergence in the Sobol index. We propose a novel strategy for active learning that focuses on resolving the main effects of the Gaussian process (associated with the numerator of the Sobol index) and compare this with existing strategies based on convergence in the total variance (the denominator of the Sobol index). The new strategy, implemented through a new learning function termed the MUSIC (minimize uncertainty in Sobol index convergence), generally converges in Sobol index error more rapidly than the existing strategies based on the Expected Improvement for Global Fit (EIGF) and the Variance Improvement for Global Fit (VIGF). Both strategies are compared with simple sequential random sampling and the MUSIC learning function generally converges most rapidly for low-dimensional problems. However, for high-dimensional problems, the performance is comparable to random sampling. The new learning strategy is demonstrated for a practical case of adaptive experimental design for large-scale Boundary Layer Wind Tunnel experiments.
LGJul 17, 2024
A Resolution Independent Neural OperatorBahador Bahmani, Somdatta Goswami, Ioannis G. Kevrekidis et al.
The Deep Operator Network (DeepONet) is a powerful neural operator architecture that uses two neural networks to map between infinite-dimensional function spaces. This architecture allows for the evaluation of the solution field at any location within the domain but requires input functions to be discretized at identical locations, limiting practical applications. We introduce a general framework for operator learning from input-output data with arbitrary sensor locations and counts. This begins by introducing a resolution-independent DeepONet (RI-DeepONet), which handles input functions discretized arbitrarily but sufficiently finely. To achieve this, we propose two dictionary learning algorithms that adaptively learn continuous basis functions, parameterized as implicit neural representations (INRs), from correlated signals on arbitrary point clouds. These basis functions project input function data onto a finite-dimensional embedding space, making it compatible with DeepONet without architectural changes. We specifically use sinusoidal representation networks (SIRENs) as trainable INR basis functions. Similarly, the dictionary learning algorithms identify basis functions for output data, defining a new neural operator architecture: the Resolution Independent Neural Operator (RINO). In RINO, the operator learning task reduces to mapping coefficients of input basis functions to output basis functions. We demonstrate RINO's robustness and applicability in handling arbitrarily sampled input and output functions during both training and inference through several numerical examples.
LGJan 31, 2023
Active Learning-based Domain Adaptive Localized Polynomial Chaos ExpansionLukáš Novák, Michael D. Shields, Václav Sadílek et al.
The paper presents a novel methodology to build surrogate models of complicated functions by an active learning-based sequential decomposition of the input random space and construction of localized polynomial chaos expansions, referred to as domain adaptive localized polynomial chaos expansion (DAL-PCE). The approach utilizes sequential decomposition of the input random space into smaller sub-domains approximated by low-order polynomial expansions. This allows approximation of functions with strong nonlinearties, discontinuities, and/or singularities. Decomposition of the input random space and local approximations alleviates the Gibbs phenomenon for these types of problems and confines error to a very small vicinity near the non-linearity. The global behavior of the surrogate model is therefore significantly better than existing methods as shown in numerical examples. The whole process is driven by an active learning routine that uses the recently proposed $Θ$ criterion to assess local variance contributions. The proposed approach balances both \emph{exploitation} of the surrogate model and \emph{exploration} of the input random space and thus leads to efficient and accurate approximation of the original mathematical model. The numerical results show the superiority of the DAL-PCE in comparison to (i) a single global polynomial chaos expansion and (ii) the recently proposed stochastic spectral embedding (SSE) method developed as an accurate surrogate model and which is based on a similar domain decomposition process. This method represents general framework upon which further extensions and refinements can be based, and which can be combined with any technique for non-intrusive polynomial chaos expansion construction.
LGDec 7, 2022
General multi-fidelity surrogate models: Framework and active learning strategies for efficient rare event simulationPromit Chakroborty, Somayajulu L. N. Dhulipala, Yifeng Che et al.
Estimating the probability of failure for complex real-world systems using high-fidelity computational models is often prohibitively expensive, especially when the probability is small. Exploiting low-fidelity models can make this process more feasible, but merging information from multiple low-fidelity and high-fidelity models poses several challenges. This paper presents a robust multi-fidelity surrogate modeling strategy in which the multi-fidelity surrogate is assembled using an active learning strategy using an on-the-fly model adequacy assessment set within a subset simulation framework for efficient reliability analysis. The multi-fidelity surrogate is assembled by first applying a Gaussian process correction to each low-fidelity model and assigning a model probability based on the model's local predictive accuracy and cost. Three strategies are proposed to fuse these individual surrogates into an overall surrogate model based on model averaging and deterministic/stochastic model selection. The strategies also dictate which model evaluations are necessary. No assumptions are made about the relationships between low-fidelity models, while the high-fidelity model is assumed to be the most accurate and most computationally expensive model. Through two analytical and two numerical case studies, including a case study evaluating the failure probability of Tristructural isotropic-coated (TRISO) nuclear fuels, the algorithm is shown to be highly accurate while drastically reducing the number of high-fidelity model calls (and hence computational cost).
LGAug 12, 2022
Bayesian Inference with Latent Hamiltonian Neural NetworksSomayajulu L. N. Dhulipala, Yifeng Che, Michael D. Shields
When sampling for Bayesian inference, one popular approach is to use Hamiltonian Monte Carlo (HMC) and specifically the No-U-Turn Sampler (NUTS) which automatically decides the end time of the Hamiltonian trajectory. However, HMC and NUTS can require numerous numerical gradients of the target density, and can prove slow in practice. We propose Hamiltonian neural networks (HNNs) with HMC and NUTS for solving Bayesian inference problems. Once trained, HNNs do not require numerical gradients of the target density during sampling. Moreover, they satisfy important properties such as perfect time reversibility and Hamiltonian conservation, making them well-suited for use within HMC and NUTS because stationarity can be shown. We also propose an HNN extension called latent HNNs (L-HNNs), which are capable of predicting latent variable outputs. Compared to HNNs, L-HNNs offer improved expressivity and reduced integration errors. Finally, we employ L-HNNs in NUTS with an online error monitoring scheme to prevent sample degeneracy in regions of low probability density. We demonstrate L-HNNs in NUTS with online error monitoring on several examples involving complex, heavy-tailed, and high-local-curvature probability densities. Overall, L-HNNs in NUTS with online error monitoring satisfactorily inferred these probability densities. Compared to traditional NUTS, L-HNNs in NUTS with online error monitoring required 1--2 orders of magnitude fewer numerical gradients of the target density and improved the effective sample size (ESS) per gradient by an order of magnitude.
MLSep 19, 2022
Physics-Informed Machine Learning of Dynamical Systems for Efficient Bayesian InferenceSomayajulu L. N. Dhulipala, Yifeng Che, Michael D. Shields
Although the no-u-turn sampler (NUTS) is a widely adopted method for performing Bayesian inference, it requires numerous posterior gradients which can be expensive to compute in practice. Recently, there has been a significant interest in physics-based machine learning of dynamical (or Hamiltonian) systems and Hamiltonian neural networks (HNNs) is a noteworthy architecture. But these types of architectures have not been applied to solve Bayesian inference problems efficiently. We propose the use of HNNs for performing Bayesian inference efficiently without requiring numerous posterior gradients. We introduce latent variable outputs to HNNs (L-HNNs) for improved expressivity and reduced integration errors. We integrate L-HNNs in NUTS and further propose an online error monitoring scheme to prevent sampling degeneracy in regions where L-HNNs may have little training data. We demonstrate L-HNNs in NUTS with online error monitoring considering several complex high-dimensional posterior densities and compare its performance to NUTS.
LGSep 4, 2023
Physics-Informed Polynomial Chaos ExpansionsLukáš Novák, Himanshu Sharma, Michael D. Shields
Surrogate modeling of costly mathematical models representing physical systems is challenging since it is typically not possible to create a large experimental design. Thus, it is beneficial to constrain the approximation to adhere to the known physics of the model. This paper presents a novel methodology for the construction of physics-informed polynomial chaos expansions (PCE) that combines the conventional experimental design with additional constraints from the physics of the model. Physical constraints investigated in this paper are represented by a set of differential equations and specified boundary conditions. A computationally efficient means for construction of physically constrained PCE is proposed and compared to standard sparse PCE. It is shown that the proposed algorithms lead to superior accuracy of the approximation and does not add significant computational burden. Although the main purpose of the proposed method lies in combining data and physical constraints, we show that physically constrained PCEs can be constructed from differential equations and boundary conditions alone without requiring evaluations of the original model. We further show that the constrained PCEs can be easily applied for uncertainty quantification through analytical post-processing of a reduced PCE filtering out the influence of all deterministic space-time variables. Several deterministic examples of increasing complexity are provided and the proposed method is applied for uncertainty quantification.
MLDec 11, 2025
Physics-informed Polynomial Chaos Expansion with Enhanced Constrained Optimization Solver and D-optimal SamplingQitian Lu, Himanshu Sharma, Michael D. Shields et al.
Physics-informed polynomial chaos expansions (PC$^2$) provide an efficient physically constrained surrogate modeling framework by embedding governing equations and other physical constraints into the standard data-driven polynomial chaos expansions (PCE) and solving via the Karush-Kuhn-Tucker (KKT) conditions. This approach improves the physical interpretability of surrogate models while achieving high computational efficiency and accuracy. However, the performance and efficiency of PC$^2$ can still be degraded with high-dimensional parameter spaces, limited data availability, or unrepresentative training data. To address this problem, this study explores two complementary enhancements to the PC$^2$ framework. First, a numerically efficient constrained optimization solver, straightforward updating of Lagrange multipliers (SULM), is adopted as an alternative to the conventional KKT solver. The SULM method significantly reduces computational cost when solving physically constrained problems with high-dimensionality and derivative boundary conditions that require a large number of virtual points. Second, a D-optimal sampling strategy is utilized to select informative virtual points to improve the stability and achieve the balance of accuracy and efficiency of the PC$^2$. The proposed methods are integrated into the PC$^2$ framework and evaluated through numerical examples of representative physical systems governed by ordinary or partial differential equations. The results demonstrate that the enhanced PC$^2$ has better comprehensive capability than standard PC$^2$, and is well-suited for high-dimensional uncertainty quantification tasks.
MLFeb 23, 2024
Physics-constrained polynomial chaos expansion for scientific machine learning and uncertainty quantificationHimanshu Sharma, Lukáš Novák, Michael D. Shields
We present a novel physics-constrained polynomial chaos expansion as a surrogate modeling method capable of performing both scientific machine learning (SciML) and uncertainty quantification (UQ) tasks. The proposed method possesses a unique capability: it seamlessly integrates SciML into UQ and vice versa, which allows it to quantify the uncertainties in SciML tasks effectively and leverage SciML for improved uncertainty assessment during UQ-related tasks. The proposed surrogate model can effectively incorporate a variety of physical constraints, such as governing partial differential equations (PDEs) with associated initial and boundary conditions constraints, inequality-type constraints (e.g., monotonicity, convexity, non-negativity, among others), and additional a priori information in the training process to supplement limited data. This ensures physically realistic predictions and significantly reduces the need for expensive computational model evaluations to train the surrogate model. Furthermore, the proposed method has a built-in uncertainty quantification (UQ) feature to efficiently estimate output uncertainties. To demonstrate the effectiveness of the proposed method, we apply it to a diverse set of problems, including linear/non-linear PDEs with deterministic and stochastic parameters, data-driven surrogate modeling of a complex physical system, and UQ of a stochastic system with parameters modeled as random fields.
MLJan 10, 2024
Reliability Analysis of Complex Systems using Subset Simulations with Hamiltonian Neural NetworksDenny Thaler, Somayajulu L. N. Dhulipala, Franz Bamer et al.
We present a new Subset Simulation approach using Hamiltonian neural network-based Monte Carlo sampling for reliability analysis. The proposed strategy combines the superior sampling of the Hamiltonian Monte Carlo method with computationally efficient gradient evaluations using Hamiltonian neural networks. This combination is especially advantageous because the neural network architecture conserves the Hamiltonian, which defines the acceptance criteria of the Hamiltonian Monte Carlo sampler. Hence, this strategy achieves high acceptance rates at low computational cost. Our approach estimates small failure probabilities using Subset Simulations. However, in low-probability sample regions, the gradient evaluation is particularly challenging. The remarkable accuracy of the proposed strategy is demonstrated on different reliability problems, and its efficiency is compared to the traditional Hamiltonian Monte Carlo method. We note that this approach can reach its limitations for gradient estimations in low-probability regions of complex and high-dimensional distributions. Thus, we propose techniques to improve gradient prediction in these particular situations and enable accurate estimations of the probability of failure. The highlight of this study is the reliability analysis of a system whose parameter distributions must be inferred with Bayesian inference problems. In such a case, the Hamiltonian Monte Carlo method requires a full model evaluation for each gradient evaluation and, therefore, comes at a very high cost. However, using Hamiltonian neural networks in this framework replaces the expensive model evaluation, resulting in tremendous improvements in computational efficiency.
MLJan 30, 2024
Polynomial Chaos Expansions on Principal Geodesic Grassmannian Submanifolds for Surrogate Modeling and Uncertainty QuantificationDimitris G. Giovanis, Dimitrios Loukrezis, Ioannis G. Kevrekidis et al.
In this work we introduce a manifold learning-based surrogate modeling framework for uncertainty quantification in high-dimensional stochastic systems. Our first goal is to perform data mining on the available simulation data to identify a set of low-dimensional (latent) descriptors that efficiently parameterize the response of the high-dimensional computational model. To this end, we employ Principal Geodesic Analysis on the Grassmann manifold of the response to identify a set of disjoint principal geodesic submanifolds, of possibly different dimension, that captures the variation in the data. Since operations on the Grassmann require the data to be concentrated, we propose an adaptive algorithm based on Riemanniann K-means and the minimization of the sample Frechet variance on the Grassmann manifold to identify "local" principal geodesic submanifolds that represent different system behavior across the parameter space. Polynomial chaos expansion is then used to construct a mapping between the random input parameters and the projection of the response on these local principal geodesic submanifolds. The method is demonstrated on four test cases, a toy-example that involves points on a hypersphere, a Lotka-Volterra dynamical system, a continuous-flow stirred-tank chemical reactor system, and a two-dimensional Rayleigh-Benard convection problem
LGFeb 16, 2025
Neural Operators for Stochastic Modeling of Nonlinear Structural System Response to Natural HazardsSomdatta Goswami, Dimitris G. Giovanis, Bowei Li et al.
Traditionally, neural networks have been employed to learn the mapping between finite-dimensional Euclidean spaces. However, recent research has opened up new horizons, focusing on the utilization of deep neural networks to learn operators capable of mapping infinite-dimensional function spaces. In this work, we employ two state-of-the-art neural operators, the deep operator network (DeepONet) and the Fourier neural operator (FNO) for the prediction of the nonlinear time history response of structural systems exposed to natural hazards, such as earthquakes and wind. Specifically, we propose two architectures, a self-adaptive FNO and a Fast Fourier Transform-based DeepONet (DeepFNOnet), where we employ a FNO beyond the DeepONet to learn the discrepancy between the ground truth and the solution predicted by the DeepONet. To demonstrate the efficiency and applicability of the architectures, two problems are considered. In the first, we use the proposed model to predict the seismic nonlinear dynamic response of a six-story shear building subject to stochastic ground motions. In the second problem, we employ the operators to predict the wind-induced nonlinear dynamic response of a high-rise building while explicitly accounting for the stochastic nature of the wind excitation. In both cases, the trained metamodels achieve high accuracy while being orders of magnitude faster than their corresponding high-fidelity models.
CEApr 9
$Ï-$DeepONet: A Discontinuity Capturing Neural OperatorSumanta Roy, Stephen T. Castonguay, Pratanu Roy et al.
We present $Ï-$DeepONet, a physics-informed neural operator designed to learn mappings between function spaces that may contain discontinuities or exhibit non-smooth behavior. Classical neural operators are based on the universal approximation theorem which assumes that both the operator and the functions it acts on are continuous. However, many scientific and engineering problems involve naturally discontinuous input fields as well as strong and weak discontinuities in the output fields caused by material interfaces. In $Ï$-DeepONet, discontinuities in the input are handled using multiple branch networks, while discontinuities in the output are learned through a nonlinear latent embedding of the interface. This embedding is constructed from a {\it one-hot} representation of the domain decomposition that is combined with the spatial coordinates in a modified trunk network. The outputs of the branch and trunk networks are then combined through a dot product to produce the final solution, which is trained using a physics- and interface-informed loss function. We evaluate $Ï$-DeepONet on several one- and two-dimensional benchmark problems and demonstrate that it delivers accurate and stable predictions even in the presence of strong interface-driven discontinuities.
MLAug 28, 2025
Polynomial Chaos Expansion for Operator LearningHimanshu Sharma, Lukáš Novák, Michael D. Shields
Operator learning (OL) has emerged as a powerful tool in scientific machine learning (SciML) for approximating mappings between infinite-dimensional functional spaces. One of its main applications is learning the solution operator of partial differential equations (PDEs). While much of the progress in this area has been driven by deep neural network-based approaches such as Deep Operator Networks (DeepONet) and Fourier Neural Operator (FNO), recent work has begun to explore traditional machine learning methods for OL. In this work, we introduce polynomial chaos expansion (PCE) as an OL method. PCE has been widely used for uncertainty quantification (UQ) and has recently gained attention in the context of SciML. For OL, we establish a mathematical framework that enables PCE to approximate operators in both purely data-driven and physics-informed settings. The proposed framework reduces the task of learning the operator to solving a system of equations for the PCE coefficients. Moreover, the framework provides UQ by simply post-processing the PCE coefficients, without any additional computational cost. We apply the proposed method to a diverse set of PDE problems to demonstrate its capabilities. Numerical results demonstrate the strong performance of the proposed method in both OL and UQ tasks, achieving excellent numerical accuracy and computational efficiency.
LGOct 27, 2025
A Physics-informed Multi-resolution Neural OperatorSumanta Roy, Bahador Bahmani, Ioannis G. Kevrekidis et al.
The predictive accuracy of operator learning frameworks depends on the quality and quantity of available training data (input-output function pairs), often requiring substantial amounts of high-fidelity data, which can be challenging to obtain in some real-world engineering applications. These datasets may be unevenly discretized from one realization to another, with the grid resolution varying across samples. In this study, we introduce a physics-informed operator learning approach by extending the Resolution Independent Neural Operator (RINO) framework to a fully data-free setup, addressing both challenges simultaneously. Here, the arbitrarily (but sufficiently finely) discretized input functions are projected onto a latent embedding space (i.e., a vector space of finite dimensions), using pre-trained basis functions. The operator associated with the underlying partial differential equations (PDEs) is then approximated by a simple multi-layer perceptron (MLP), which takes as input a latent code along with spatiotemporal coordinates to produce the solution in the physical space. The PDEs are enforced via a finite difference solver in the physical space. The validation and performance of the proposed method are benchmarked on several numerical examples with multi-resolution data, where input functions are sampled at varying resolutions, including both coarse and fine discretizations.
MLJul 19, 2025
Accelerating Hamiltonian Monte Carlo for Bayesian Inference in Neural Networks and Neural OperatorsPonkrshnan Thiagarajan, Tamer A. Zaki, Michael D. Shields
Hamiltonian Monte Carlo (HMC) is a powerful and accurate method to sample from the posterior distribution in Bayesian inference. However, HMC techniques are computationally demanding for Bayesian neural networks due to the high dimensionality of the network's parameter space and the non-convexity of their posterior distributions. Therefore, various approximation techniques, such as variational inference (VI) or stochastic gradient MCMC, are often employed to infer the posterior distribution of the network parameters. Such approximations introduce inaccuracies in the inferred distributions, resulting in unreliable uncertainty estimates. In this work, we propose a hybrid approach that combines inexpensive VI and accurate HMC methods to efficiently and accurately quantify uncertainties in neural networks and neural operators. The proposed approach leverages an initial VI training on the full network. We examine the influence of individual parameters on the prediction uncertainty, which shows that a large proportion of the parameters do not contribute substantially to uncertainty in the network predictions. This information is then used to significantly reduce the dimension of the parameter space, and HMC is performed only for the subset of network parameters that strongly influence prediction uncertainties. This yields a framework for accelerating the full batch HMC for posterior inference in neural networks. We demonstrate the efficiency and accuracy of the proposed framework on deep neural networks and operator networks, showing that inference can be performed for large networks with tens to hundreds of thousands of parameters. We show that this method can effectively learn surrogates for complex physical systems by modeling the operator that maps from upstream conditions to wall-pressure data on a cone in hypersonic flow.
MLJul 15, 2025
Joint space-time wind field data extrapolation and uncertainty quantification using nonparametric Bayesian dictionary learningGeorge D. Pasparakis, Ioannis A. Kougioumtzoglou, Michael D. Shields
A methodology is developed, based on nonparametric Bayesian dictionary learning, for joint space-time wind field data extrapolation and estimation of related statistics by relying on limited/incomplete measurements. Specifically, utilizing sparse/incomplete measured data, a time-dependent optimization problem is formulated for determining the expansion coefficients of an associated low-dimensional representation of the stochastic wind field. Compared to an alternative, standard, compressive sampling treatment of the problem, the developed methodology exhibits the following advantages. First, the Bayesian formulation enables also the quantification of the uncertainty in the estimates. Second, the requirement in standard CS-based applications for an a priori selection of the expansion basis is circumvented. Instead, this is done herein in an adaptive manner based on the acquired data. Overall, the methodology exhibits enhanced extrapolation accuracy, even in cases of high-dimensional data of arbitrary form, and of relatively large extrapolation distances. Thus, it can be used, potentially, in a wide range of wind engineering applications where various constraints dictate the use of a limited number of sensors. The efficacy of the methodology is demonstrated by considering two case studies. The first relates to the extrapolation of simulated wind velocity records consistent with a prescribed joint wavenumber-frequency power spectral density in a three-dimensional domain (2D and time). The second pertains to the extrapolation of four-dimensional (3D and time) boundary layer wind tunnel experimental data that exhibit significant spatial variability and non-Gaussian characteristics.
MLJun 21, 2024
Bayesian neural networks for predicting uncertainty in full-field material responseGeorge D. Pasparakis, Lori Graham-Brady, Michael D. Shields
Stress and material deformation field predictions are among the most important tasks in computational mechanics. These predictions are typically made by solving the governing equations of continuum mechanics using finite element analysis, which can become computationally prohibitive considering complex microstructures and material behaviors. Machine learning (ML) methods offer potentially cost effective surrogates for these applications. However, existing ML surrogates are either limited to low-dimensional problems and/or do not provide uncertainty estimates in the predictions. This work proposes an ML surrogate framework for stress field prediction and uncertainty quantification for diverse materials microstructures. A modified Bayesian U-net architecture is employed to provide a data-driven image-to-image mapping from initial microstructure to stress field with prediction (epistemic) uncertainty estimates. The Bayesian posterior distributions for the U-net parameters are estimated using three state-of-the-art inference algorithms: the posterior sampling-based Hamiltonian Monte Carlo method and two variational approaches, the Monte-Carlo Dropout method and the Bayes by Backprop algorithm. A systematic comparison of the predictive accuracy and uncertainty estimates for these methods is performed for a fiber reinforced composite material and polycrystalline microstructure application. It is shown that the proposed methods yield predictions of high accuracy compared to the FEA solution, while uncertainty estimates depend on the inference approach. Generally, the Hamiltonian Monte Carlo and Bayes by Backprop methods provide consistent uncertainty estimates. Uncertainty estimates from Monte Carlo Dropout, on the other hand, are more difficult to interpret and depend strongly on the method's design.
LGFeb 9, 2022
A survey of unsupervised learning methods for high-dimensional uncertainty quantification in black-box-type problemsKatiana Kontolati, Dimitrios Loukrezis, Dimitris G. Giovanis et al.
Constructing surrogate models for uncertainty quantification (UQ) on complex partial differential equations (PDEs) having inherently high-dimensional $\mathcal{O}(10^{\ge 2})$ stochastic inputs (e.g., forcing terms, boundary conditions, initial conditions) poses tremendous challenges. The curse of dimensionality can be addressed with suitable unsupervised learning techniques used as a pre-processing tool to encode inputs onto lower-dimensional subspaces while retaining its structural information and meaningful properties. In this work, we review and investigate thirteen dimension reduction methods including linear and nonlinear, spectral, blind source separation, convex and non-convex methods and utilize the resulting embeddings to construct a mapping to quantities of interest via polynomial chaos expansions (PCE). We refer to the general proposed approach as manifold PCE (m-PCE), where manifold corresponds to the latent space resulting from any of the studied dimension reduction methods. To investigate the capabilities and limitations of these methods we conduct numerical tests for three physics-based systems (treated as black-boxes) having high-dimensional stochastic inputs of varying complexity modeled as both Gaussian and non-Gaussian random fields to investigate the effect of the intrinsic dimensionality of input data. We demonstrate both the advantages and limitations of the unsupervised learning methods and we conclude that a suitable m-PCE model provides a cost-effective approach compared to alternative algorithms proposed in the literature, including recently proposed expensive deep neural network-based surrogates and can be readily applied for high-dimensional UQ in stochastic PDEs.
APJan 6, 2022
Reliability Estimation of an Advanced Nuclear Fuel using Coupled Active Learning, Multifidelity Modeling, and Subset SimulationSomayajulu L. N. Dhulipala, Michael D. Shields, Promit Chakroborty et al.
Tristructural isotropic (TRISO)-coated particle fuel is a robust nuclear fuel and determining its reliability is critical for the success of advanced nuclear technologies. However, TRISO failure probabilities are small and the associated computational models are expensive. We used coupled active learning, multifidelity modeling, and subset simulation to estimate the failure probabilities of TRISO fuels using several 1D and 2D models. With multifidelity modeling, we replaced expensive high-fidelity (HF) model evaluations with information fusion from two low-fidelity (LF) models. For the 1D TRISO models, we considered three multifidelity modeling strategies: only Kriging, Kriging LF prediction plus Kriging correction, and deep neural network (DNN) LF prediction plus Kriging correction. While the results across these multifidelity modeling strategies compared satisfactorily, strategies employing information fusion from two LF models consistently called the HF model least often. Next, for the 2D TRISO model, we considered two multifidelity modeling strategies: DNN LF prediction plus Kriging correction (data-driven) and 1D TRISO LF prediction plus Kriging correction (physics-based). The physics-based strategy, as expected, consistently required the fewest calls to the HF model. However, the data-driven strategy had a lower overall simulation time since the DNN predictions are instantaneous, and the 1D TRISO model requires a non-negligible simulation time.
BIO-PHOct 29, 2021
Data-driven Uncertainty Quantification in Computational Human Head ModelsKshitiz Upadhyay, Dimitris G. Giovanis, Ahmed Alshareef et al.
Computational models of the human head are promising tools for estimating the impact-induced response of brain, and thus play an important role in the prediction of traumatic brain injury. Modern biofidelic head model simulations are associated with very high computational cost, and high-dimensional inputs and outputs, which limits the applicability of traditional uncertainty quantification (UQ) methods on these systems. In this study, a two-stage, data-driven manifold learning-based framework is proposed for UQ of computational head models. This framework is demonstrated on a 2D subject-specific head model, where the goal is to quantify uncertainty in the simulated strain fields (i.e., output), given variability in the material properties of different brain substructures (i.e., input). In the first stage, a data-driven method based on multi-dimensional Gaussian kernel-density estimation and diffusion maps is used to generate realizations of the input random vector directly from the available data. Computational simulations of a small number of realizations provide input-output pairs for training data-driven surrogate models in the second stage. The surrogate models employ nonlinear dimensionality reduction using Grassmannian diffusion maps, Gaussian process regression to create a low-cost mapping between the input random vector and the reduced solution space, and geometric harmonics models for mapping between the reduced space and the Grassmann manifold. It is demonstrated that the surrogate models provide highly accurate approximations of the computational model while significantly reducing the computational cost. Monte Carlo simulations of the surrogate models are used for uncertainty propagation. UQ of strain fields highlight significant spatial variation in model uncertainty, and reveal key differences in uncertainty among commonly used strain-based brain injury predictor variables.
DATA-ANSep 28, 2021
Grassmannian diffusion maps based surrogate modeling via geometric harmonicsKetson R. M. dos Santos, Dimitrios G. Giovanis, Katiana Kontolati et al.
In this paper, a novel surrogate model based on the Grassmannian diffusion maps (GDMaps) and utilizing geometric harmonics is developed for predicting the response of engineering systems and complex physical phenomena. The method utilizes the GDMaps to obtain a low-dimensional representation of the underlying behavior of physical/mathematical systems with respect to uncertainties in the input parameters. Using this representation, geometric harmonics, an out-of-sample function extension technique, is employed to create a global map from the space of input parameters to a Grassmannian diffusion manifold. Geometric harmonics is also employed to locally map points on the diffusion manifold onto the tangent space of a Grassmann manifold. The exponential map is then used to project the points in the tangent space onto the Grassmann manifold, where reconstruction of the full solution is performed. The performance of the proposed surrogate modeling is verified with three examples. The first problem is a toy example used to illustrate the development of the technique. In the second example, errors associated with the various mappings employed in the technique are assessed by studying response predictions of the electric potential of a dielectric cylinder in a homogeneous electric field. The last example applies the method for uncertainty prediction in the strain field evolution in a model amorphous material using the shear transformation zone (STZ) theory of plasticity. In all examples, accurate predictions are obtained, showing that the present technique is a strong candidate for the application of uncertainty quantification in large-scale models.
DATA-ANJul 21, 2021
Manifold learning-based polynomial chaos expansions for high-dimensional surrogate modelsKatiana Kontolati, Dimitrios Loukrezis, Ketson R. M. dos Santos et al.
In this work we introduce a manifold learning-based method for uncertainty quantification (UQ) in systems describing complex spatiotemporal processes. Our first objective is to identify the embedding of a set of high-dimensional data representing quantities of interest of the computational or analytical model. For this purpose, we employ Grassmannian diffusion maps, a two-step nonlinear dimension reduction technique which allows us to reduce the dimensionality of the data and identify meaningful geometric descriptions in a parsimonious and inexpensive manner. Polynomial chaos expansion is then used to construct a mapping between the stochastic input parameters and the diffusion coordinates of the reduced space. An adaptive clustering technique is proposed to identify an optimal number of clusters of points in the latent space. The similarity of points allows us to construct a number of geometric harmonic emulators which are finally utilized as a set of inexpensive pre-trained models to perform an inverse map of realizations of latent features to the ambient space and thus perform accurate out-of-sample predictions. Thus, the proposed method acts as an encoder-decoder system which is able to automatically handle very high-dimensional data while simultaneously operating successfully in the small-data regime. The method is demonstrated on two benchmark problems and on a system of advection-diffusion-reaction equations which model a first-order chemical reaction between two species. In all test cases, the proposed method is able to achieve highly accurate approximations which ultimately lead to the significant acceleration of UQ tasks.
APNov 23, 2020
Probabilistic modeling of discrete structural response with application to composite plate penetration modelsAnindya Bhaduri, Christopher S. Meyer, John W. Gillespie et al.
Discrete response of structures is often a key probabilistic quantity of interest. For example, one may need to identify the probability of a binary event, such as, whether a structure has buckled or not. In this study, an adaptive domain-based decomposition and classification method, combined with sparse grid sampling, is used to develop an efficient classification surrogate modeling algorithm for such discrete outputs. An assumption of monotonic behaviour of the output with respect to all model parameters, based on the physics of the problem, helps to reduce the number of model evaluations and makes the algorithm more efficient. As an application problem, this paper deals with the development of a computational framework for generation of probabilistic penetration response of S-2 glass/SC-15 epoxy composite plates under ballistic impact. This enables the computationally feasible generation of the probabilistic velocity response (PVR) curve or the $V_0-V_{100}$ curve as a function of the impact velocity, and the ballistic limit velocity prediction as a function of the model parameters. The PVR curve incorporates the variability of the model input parameters and describes the probability of penetration of the plate as a function of impact velocity.