Zongren Zou

h-index 142

21 papers

1,400 citations

Novelty 45%

AI Score 49

Ranked #47,672 of 201,326 authors (top 24%); #10,872 in LG (top 26%)

21 Papers

LG · Aug 25, 2022 · Code

NeuralUQ: A comprehensive library for uncertainty quantification in neural differential equations and operators

Zongren Zou, Xuhui Meng, Apostolos F Psaros et al.

Uncertainty quantification (UQ) in machine learning is currently drawing increasing research interest, driven by the rapid deployment of deep neural networks across different fields, such as computer vision and natural language processing, and by the need for reliable tools in risk-sensitive applications. Recently, various machine learning models have also been developed to tackle problems in the field of scientific computing with applications to computational science and engineering (CSE). Physics-informed neural networks and deep operator networks are two such models for solving partial differential equations and learning operator mappings, respectively. In this regard, a comprehensive study of UQ methods tailored specifically for scientific machine learning (SciML) models has been provided in [45]. Nevertheless, and despite their theoretical merit, implementations of these methods are not straightforward, especially in large-scale CSE applications, hindering their broad adoption in both research and industry settings. In this paper, we present an open-source Python library (https://github.com/Crunch-UQ4MI), termed NeuralUQ and accompanied by an educational tutorial, for employing UQ methods for SciML in a convenient and structured manner. The library, designed for both educational and research purposes, supports multiple modern UQ methods and SciML models. It is based on a succinct workflow and facilitates flexible employment and easy extensions by the users. We first present a tutorial of NeuralUQ and subsequently demonstrate its applicability and efficiency in four diverse examples, involving dynamical systems and high-dimensional parametric and time-dependent PDEs.

LG · Jan 5, 2023 · Code

L-HYDRA: Multi-Head Physics-Informed Neural Networks

Zongren Zou, George Em Karniadakis

We introduce multi-head neural networks (MH-NNs) to physics-informed machine learning; an MH-NN is a type of neural network (NN) with all nonlinear hidden layers as the body and multiple linear output layers as the heads. Hence, we construct multi-head physics-informed neural networks (MH-PINNs) as a potent tool for multi-task learning (MTL), generative modeling, and few-shot learning for diverse problems in scientific machine learning (SciML). MH-PINNs connect multiple functions/tasks via a shared body as the basis functions as well as a shared distribution for the head. The former is accomplished by solving multiple tasks with MH-PINNs with each head independently corresponding to each task, while the latter by employing normalizing flows (NFs) for density estimation and generative modeling. To this end, our method is a two-stage method, and both stages can be tackled with standard deep learning tools of NNs, enabling easy implementation in practice. MH-PINNs can be used for various purposes, such as approximating stochastic processes, solving multiple tasks synergistically, providing informative prior knowledge for downstream few-shot learning tasks such as meta-learning and transfer learning, learning representative basis functions, and uncertainty quantification. We demonstrate the effectiveness of MH-PINNs in five benchmarks, investigating also the possibility of synergistic learning in regression analysis. We name the open-source code "Lernaean Hydra" (L-HYDRA), since this mythical creature possessed many heads for performing important multiple tasks, as in the proposed method.

LG · May 12, 2022

Bayesian Physics-Informed Neural Networks for real-world nonlinear dynamical systems

Kevin Linka, Amelie Schafer, Xuhui Meng et al.

Understanding real-world dynamical phenomena remains a challenging task. Across various scientific disciplines, machine learning has advanced as the go-to technology to analyze nonlinear dynamical systems, identify patterns in big data, and make decisions around them. Neural networks are now consistently used as universal function approximators for data with underlying mechanisms that are incompletely understood or exceedingly complex. However, neural networks alone ignore the fundamental laws of physics and often fail to make plausible predictions. Here we integrate data, physics, and uncertainties by combining neural networks, physics-informed modeling, and Bayesian inference to improve the predictive potential of traditional neural network models. We embed the physical model of a damped harmonic oscillator into a fully-connected feed-forward neural network to explore a simple and illustrative model system, the outbreak dynamics of COVID-19. Our Physics-Informed Neural Networks can seamlessly integrate data and physics, robustly solve forward and inverse problems, and perform well for both interpolation and extrapolation, even for a small amount of noisy and incomplete data. At only minor additional cost, they can self-adaptively learn the weighting between data and physics. Combined with Bayesian Neural Networks, they can serve as priors in a Bayesian Inference, and provide credible intervals for uncertainty quantification. Our study reveals the inherent advantages and disadvantages of Neural Networks, Bayesian Inference, and a combination of both and provides valuable guidelines for model selection. While we have only demonstrated these approaches for the simple model problem of a seasonal endemic infectious disease, we anticipate that the underlying concepts and trends generalize to more complex disease conditions and, more broadly, to a wide variety of nonlinear dynamical systems.

LG · Oct 16, 2023

Correcting model misspecification in physics-informed neural networks (PINNs)

Zongren Zou, Xuhui Meng, George Em Karniadakis

Data-driven discovery of governing equations in computational science has emerged as a new paradigm for obtaining accurate physical models and as a possible alternative to theoretical derivations. The recently developed physics-informed neural networks (PINNs) have also been employed to learn governing equations given data across diverse scientific disciplines. Despite the effectiveness of PINNs for discovering governing equations, the physical models encoded in PINNs may be misspecified in complex systems as some of the physical processes may not be fully understood, leading to poor accuracy of PINN predictions. In this work, we present a general approach to correct the misspecified physical models in PINNs for discovering governing equations, given some sparse and/or noisy data. Specifically, we first encode the assumed physical models, which may be misspecified, then employ other deep neural networks (DNNs) to model the discrepancy between the imperfect models and the observational data. Due to the expressivity of DNNs, the proposed method is capable of reducing the computational errors caused by the model misspecification and thus enables the application of PINNs in complex systems where the physical processes are not exactly known. Furthermore, we utilize the Bayesian PINNs (B-PINNs) and/or ensemble PINNs to quantify uncertainties arising from noisy and/or gappy data in the discovered governing equations. A series of numerical examples including non-Newtonian channel and cavity flows demonstrate that the added DNNs are capable of correcting the model misspecification in PINNs and thus reduce the discrepancy between the physical models and the observational data. We envision that the proposed approach will extend the applications of PINNs for discovering governing equations in problems where the physico-chemical or biological processes are not well understood.

LG · Jul 16, 2023

Discovering a reaction-diffusion model for Alzheimer's disease by combining PINNs with symbolic regression

Zhen Zhang, Zongren Zou, Ellen Kuhl et al.

Misfolded tau proteins play a critical role in the progression and pathology of Alzheimer's disease. Recent studies suggest that the spatio-temporal pattern of misfolded tau follows a reaction-diffusion type equation. However, the precise mathematical model and parameters that characterize the progression of misfolded protein across the brain remain incompletely understood. Here, we use deep learning and artificial intelligence to discover a mathematical model for the progression of Alzheimer's disease using longitudinal tau positron emission tomography from the Alzheimer's Disease Neuroimaging Initiative database. Specifically, we integrate physics-informed neural networks (PINNs) and symbolic regression to discover a reaction-diffusion type partial differential equation for tau protein misfolding and spreading. First, we demonstrate the potential of our model and parameter discovery on synthetic data. Then, we apply our method to discover the best model and parameters to explain tau imaging data from 46 individuals who are likely to develop Alzheimer's disease and 30 healthy controls. Our symbolic regression discovers different misfolding models $f(c)$ for the two groups, with a faster misfolding for the Alzheimer's group, $f(c) = 0.23c^3 - 1.34c^2 + 1.11c$, than for the healthy control group, $f(c) = -c^3 + 0.62c^2 + 0.39c$. Our results suggest that PINNs, supplemented by symbolic regression, can discover a reaction-diffusion type model to explain misfolded tau protein concentrations in Alzheimer's disease. We expect our study to be the starting point for a more holistic analysis to provide image-based technologies for early diagnosis, and ideally early treatment of neurodegeneration in Alzheimer's disease and possibly other misfolding-protein based neurodegenerative disorders.
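The two misfolding terms quoted in this abstract can be compared numerically. The sketch below is only an illustration: it evaluates the two polynomials as printed and estimates their slope at zero concentration, which equals the linear coefficient (1.11 for the Alzheimer's group, 0.39 for the controls). The function names and the finite-difference helper are hypothetical and are not taken from the paper's code.

```python
def f_alzheimers(c):
    """Misfolding term reported for the Alzheimer's group."""
    return 0.23 * c**3 - 1.34 * c**2 + 1.11 * c


def f_control(c):
    """Misfolding term reported for the healthy control group."""
    return -1.0 * c**3 + 0.62 * c**2 + 0.39 * c


def slope_at_zero(f, h=1e-6):
    """Central-difference estimate of f'(0), the growth rate at low concentration."""
    return (f(h) - f(-h)) / (2.0 * h)


rate_alzheimers = slope_at_zero(f_alzheimers)
rate_control = slope_at_zero(f_control)
print("growth rate near c = 0:", rate_alzheimers, rate_control)

# Tabulate both terms on a few concentrations between 0 and 1.
for c in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(c, f_alzheimers(c), f_control(c))
```

Under this reading, "faster misfolding" refers to the larger growth rate at low concentration; the polynomials cross at intermediate concentrations, so the comparison is not uniform over the whole interval.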

LG · Nov 19, 2023

Uncertainty quantification for noisy inputs-outputs in physics-informed neural networks and neural operators

Zongren Zou, Xuhui Meng, George Em Karniadakis

Uncertainty quantification (UQ) in scientific machine learning (SciML) becomes increasingly critical as neural networks (NNs) are being widely adopted in addressing complex problems across various scientific disciplines. Representative SciML models are physics-informed neural networks (PINNs) and neural operators (NOs). While UQ in SciML has been increasingly investigated in recent years, very few works have focused on addressing the uncertainty caused by the noisy inputs, such as spatial-temporal coordinates in PINNs and input functions in NOs. The presence of noise in the inputs of the models can pose significantly more challenges compared to noise in the outputs of the models, primarily due to the inherent nonlinearity of most SciML algorithms. As a result, UQ for noisy inputs becomes a crucial factor for reliable and trustworthy deployment of these models in applications involving physical knowledge. To this end, we introduce a Bayesian approach to quantify uncertainty arising from noisy inputs-outputs in PINNs and NOs. We show that this approach can be seamlessly integrated into PINNs and NOs, when they are employed to encode the physical information. PINNs incorporate physics by including physics-informed terms via automatic differentiation, either in the loss function or the likelihood, and often take as input the spatial-temporal coordinate. Therefore, the present method equips PINNs with the capability to address problems where the observed coordinate is subject to noise. On the other hand, pretrained NOs are also commonly employed as equation-free surrogates in solving differential equations and Bayesian inverse problems, in which they take functions as inputs. The proposed approach enables them to handle noisy measurements for both input and output functions with UQ.

LG · Jul 30, 2024

NeuroSEM: A hybrid framework for simulating multiphysics problems by coupling PINNs and spectral elements

Khemraj Shukla, Zongren Zou, Chi Hin Chan et al.

Multiphysics problems, which are characterized by complex interactions among fluid dynamics, heat transfer, structural mechanics, and electromagnetics, are inherently challenging due to their coupled nature. While experimental data on certain state variables may be available, integrating these data with numerical solvers remains a significant challenge. Physics-informed neural networks (PINNs) have shown promising results in various engineering disciplines, particularly in handling noisy data and solving inverse problems in partial differential equations (PDEs). However, their effectiveness in forecasting nonlinear phenomena in multiphysics regimes, particularly involving turbulence, is yet to be fully established. This study introduces NeuroSEM, a hybrid framework integrating PINNs with the high-fidelity Spectral Element Method (SEM) solver, Nektar++. NeuroSEM leverages the strengths of both PINNs and SEM, providing robust solutions for multiphysics problems. PINNs are trained to assimilate data and model physical phenomena in specific subdomains, which are then integrated into the Nektar++ solver. We demonstrate the efficiency and accuracy of NeuroSEM for thermal convection in cavity flow and flow past a cylinder. We applied NeuroSEM to the Rayleigh-Bénard convection system, including cases with missing thermal boundary conditions and noisy datasets, and to real particle image velocimetry (PIV) data to capture flow patterns characterized by horseshoe vortical structures. The framework's plug-and-play nature facilitates its extension to other multiphysics or multiscale problems. Furthermore, NeuroSEM is optimized for efficient execution on emerging integrated GPU-CPU architectures. This hybrid approach enhances the accuracy and efficiency of simulations, making it a powerful tool for tackling complex engineering challenges in various scientific domains.

LG · Aug 13, 2024

Quantification of total uncertainty in the physics-informed reconstruction of CVSim-6 physiology

Mario De Florio, Zongren Zou, Daniele E. Schiavazzi et al.

When predicting physical phenomena through simulation, quantification of the total uncertainty due to multiple sources is as crucial as making sure the underlying numerical model is accurate. Possible sources include irreducible aleatoric uncertainty due to noise in the data, epistemic uncertainty induced by insufficient data or inadequate parameterization, and model-form uncertainty related to the use of misspecified model equations. Physics-based regularization interacts in nontrivial ways with aleatoric, epistemic and model-form uncertainty and their combination, and a better understanding of this interaction is needed to improve the predictive performance of physics-informed digital twins that operate under real conditions. With a specific focus on biological and physiological models, this study investigates the decomposition of total uncertainty in the estimation of states and parameters of a differential system simulated with MC X-TFC, a new physics-informed approach for uncertainty quantification based on random projections and Monte-Carlo sampling. MC X-TFC is applied to a six-compartment stiff ODE system, the CVSim-6 model, developed in the context of human physiology. The system is analyzed by progressively removing data while estimating an increasing number of parameters and by investigating total uncertainty under model-form misspecification of non-linear resistance in the pulmonary compartment. In particular, we focus on the interaction between the formulation of the discrepancy term and quantification of model-form uncertainty, and show how additional physics can help in the estimation process. The method demonstrates robustness and efficiency in estimating unknown states and parameters, even with limited, sparse, and noisy data. It also offers great flexibility in integrating data with physics for improved estimation, even in cases of model misspecification.

LG · Nov 13, 2023

Leveraging Hamilton-Jacobi PDEs with time-dependent Hamiltonians for continual scientific machine learning

Paula Chen, Tingwei Meng, Zongren Zou et al.

We address two major challenges in scientific machine learning (SciML): interpretability and computational efficiency. We increase the interpretability of certain learning processes by establishing a new theoretical connection between optimization problems arising from SciML and a generalized Hopf formula, which represents the viscosity solution to a Hamilton-Jacobi partial differential equation (HJ PDE) with time-dependent Hamiltonian. Namely, we show that when we solve certain regularized learning problems with integral-type losses, we actually solve an optimal control problem and its associated HJ PDE with time-dependent Hamiltonian. This connection allows us to reinterpret incremental updates to learned models as the evolution of an associated HJ PDE and optimal control problem in time, where all of the previous information is intrinsically encoded in the solution to the HJ PDE. As a result, existing HJ PDE solvers and optimal control algorithms can be reused to design new efficient training approaches for SciML that naturally coincide with the continual learning framework, while avoiding catastrophic forgetting. As a first exploration of this connection, we consider the special case of linear regression and leverage our connection to develop a new Riccati-based methodology for solving these learning problems that is amenable to continual learning applications. We also provide some corresponding numerical examples that demonstrate the potential computational and memory advantages our Riccati-based approach can provide.

LG · Mar 22, 2023

Leveraging Multi-time Hamilton-Jacobi PDEs for Certain Scientific Machine Learning Problems

Paula Chen, Tingwei Meng, Zongren Zou et al.

Hamilton-Jacobi partial differential equations (HJ PDEs) have deep connections with a wide range of fields, including optimal control, differential games, and imaging sciences. By considering the time variable to be a higher dimensional quantity, HJ PDEs can be extended to the multi-time case. In this paper, we establish a novel theoretical connection between specific optimization problems arising in machine learning and the multi-time Hopf formula, which corresponds to a representation of the solution to certain multi-time HJ PDEs. Through this connection, we increase the interpretability of the training process of certain machine learning applications by showing that when we solve these learning problems, we also solve a multi-time HJ PDE and, by extension, its corresponding optimal control problem. As a first exploration of this connection, we develop the relation between the regularized linear regression problem and the Linear Quadratic Regulator (LQR). We then leverage our theoretical connection to adapt standard LQR solvers (namely, those based on the Riccati ordinary differential equations) to design new training approaches for machine learning. Finally, we provide some numerical examples that demonstrate the versatility and possible computational advantages of our Riccati-based approach in the context of continual learning, post-training calibration, transfer learning, and sparse dynamics identification.

LG · Sep 15, 2024

HJ-sampler: A Bayesian sampler for inverse problems of a stochastic process by leveraging Hamilton-Jacobi PDEs and score-based generative models

Tingwei Meng, Zongren Zou, Jérôme Darbon et al.

The interplay between stochastic processes and optimal control has been extensively explored in the literature. With the recent surge in the use of diffusion models, stochastic processes have increasingly been applied to sample generation. This paper builds on the log transform, known as the Cole-Hopf transform in Brownian motion contexts, and extends it within a more abstract framework that includes a linear operator. Within this framework, we found that the well-known relationship between the Cole-Hopf transform and optimal transport is a particular instance where the linear operator acts as the infinitesimal generator of a stochastic process. We also introduce a novel scenario where the linear operator is the adjoint of the generator, linking to Bayesian inference under specific initial and terminal conditions. Leveraging this theoretical foundation, we develop a new algorithm, named the HJ-sampler, for Bayesian inference for the inverse problem of a stochastic differential equation with given terminal observations. The HJ-sampler involves two stages: (1) solving the viscous Hamilton-Jacobi partial differential equations, and (2) sampling from the associated stochastic optimal control problem. Our proposed algorithm naturally allows for flexibility in selecting the numerical solver for viscous HJ PDEs. We introduce two variants of the solver: the Riccati-HJ-sampler, based on the Riccati method, and the SGM-HJ-sampler, which utilizes diffusion models. We demonstrate the effectiveness and flexibility of the proposed methods by applying them to solve Bayesian inverse problems involving various stochastic processes and prior distributions, including applications that address model misspecifications and quantifying model uncertainty.

LG · Apr 18

Uncertainty Quantification in PINNs for Turbulent Flows: Bayesian Inference and Repulsive Ensembles

Khemraj Shukla, Zongren Zou, Theo Kaeufer et al.

Physics-informed neural networks (PINNs) have emerged as a promising framework for solving inverse problems governed by partial differential equations (PDEs), including the reconstruction of turbulent flow fields from sparse data. However, most existing PINN formulations are deterministic and do not provide reliable quantification of epistemic uncertainty, which is critical for ill-posed problems such as data-driven Reynolds-averaged Navier-Stokes (RANS) modeling. In this work, we develop and systematically evaluate a set of probabilistic extensions of PINNs for uncertainty quantification in turbulence modeling. The proposed framework combines (i) Bayesian PINNs with Hamiltonian Monte Carlo sampling and a tempered multi-component likelihood, (ii) Monte Carlo dropout, and (iii) repulsive deep ensembles that enforce diversity in function space. Particular emphasis is placed on the role of ensemble diversity and likelihood tempering in improving uncertainty calibration for PDE-constrained inverse problems. The methods are assessed on a hierarchy of test cases, including the Van der Pol oscillator and turbulent flow past a circular cylinder at Reynolds numbers Re = 3,900 (direct numerical simulation data) and Re = 10,000 (experimental particle image velocimetry data). The results demonstrate that Bayesian PINNs provide the most consistent uncertainty estimates across all inferred quantities, while function-space repulsive ensembles offer a computationally efficient approximation with competitive accuracy for primary flow variables. These findings provide quantitative insight into the trade-offs between accuracy, computational cost, and uncertainty calibration in physics-informed learning, and offer practical guidance for uncertainty quantification in data-driven turbulence modeling.

LG · Jan 19, 2022 · Code

Uncertainty Quantification in Scientific Machine Learning: Methods, Metrics, and Comparisons

Apostolos F Psaros, Xuhui Meng, Zongren Zou et al.

Neural networks (NNs) are currently changing the computational paradigm on how to combine data with mathematical laws in physics and engineering in a profound way, tackling challenging inverse and ill-posed problems not solvable with traditional methods. However, quantifying errors and uncertainties in NN-based inference is more complicated than in traditional methods. This is because, in addition to aleatoric uncertainty associated with noisy data, there is also uncertainty due to limited data, as well as due to NN hyperparameters, overparametrization, optimization and sampling errors, and model misspecification. Although there are some recent works on uncertainty quantification (UQ) in NNs, there is no systematic investigation of suitable methods towards quantifying the total uncertainty effectively and efficiently even for function approximation, and there is even less work on solving partial differential equations and learning operator mappings between infinite-dimensional function spaces using NNs. In this work, we present a comprehensive framework that includes uncertainty modeling, new and existing solution methods, as well as evaluation metrics and post-hoc improvement approaches. To demonstrate the applicability and reliability of our framework, we present an extensive comparative study in which various methods are tested on prototype problems, including problems with mixed input-output data, and stochastic problems in high dimensions. In the Appendix, we include a comprehensive description of all the UQ methods employed, which we will make available as an open-source library of all codes included in this framework.

LG · Oct 17, 2024

From PINNs to PIKANs: Recent Advances in Physics-Informed Machine Learning

Juan Diego Toscano, Vivek Oommen, Alan John Varghese et al.

Physics-Informed Neural Networks (PINNs) have emerged as a key tool in Scientific Machine Learning since their introduction in 2017, enabling the efficient solution of ordinary and partial differential equations using sparse measurements. Over the past few years, significant advancements have been made in the training and optimization of PINNs, covering aspects such as network architectures, adaptive refinement, domain decomposition, and the use of adaptive weights and activation functions. A notable recent development is the Physics-Informed Kolmogorov-Arnold Networks (PIKANs), which leverage a representation model originally proposed by Kolmogorov in 1957, offering a promising alternative to traditional PINNs. In this review, we provide a comprehensive overview of the latest advancements in PINNs, focusing on improvements in network design, feature expansion, optimization techniques, uncertainty quantification, and theoretical insights. We also survey key applications across a range of fields, including biomedicine, fluid and solid mechanics, geophysics, dynamical systems, heat transfer, chemical engineering, and beyond. Finally, we review computational frameworks and software tools developed by both academia and industry to support PINN research and applications.

LG · Mar 8, 2025

Learning and discovering multiple solutions using physics-informed neural networks with random initialization and deep ensemble

Zongren Zou, Zhicheng Wang, George Em Karniadakis

We explore the capability of physics-informed neural networks (PINNs) to discover multiple solutions. Many real-world phenomena governed by nonlinear differential equations (DEs), such as fluid flow, exhibit multiple solutions under the same conditions, yet capturing this solution multiplicity remains a significant challenge. A key difficulty is giving appropriate initial conditions or initial guesses, to which the widely used time-marching schemes and Newton's iteration method are very sensitive in finding solutions for complex computational problems. While machine learning models, particularly PINNs, have shown promise in solving DEs, their ability to capture multiple solutions remains underexplored. In this work, we propose a simple and practical approach using PINNs to learn and discover multiple solutions. We first reveal that PINNs, when combined with random initialization and deep ensemble method -- originally developed for uncertainty quantification -- can effectively uncover multiple solutions to nonlinear ordinary and partial differential equations (ODEs/PDEs). Our approach highlights the critical role of initialization in shaping solution diversity, addressing an often-overlooked aspect of machine learning for scientific computing. Furthermore, we propose utilizing PINN-generated solutions as initial conditions or initial guesses for conventional numerical solvers to enhance accuracy and efficiency in capturing multiple solutions. Extensive numerical experiments, including the Allen-Cahn equation and cavity flow, where our approach successfully identifies both stable and unstable solutions, validate the effectiveness of our method. These findings establish a general and efficient framework for addressing solution multiplicity in nonlinear differential equations.
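The role of random initialization in reaching different solutions can be illustrated on a toy problem. The sketch below is not a PINN and is not the authors' code: it assumes a scalar stand-in, the steady homogeneous Allen-Cahn residual u - u^3 = 0, whose solutions are -1, 0, and 1. Each ensemble member runs plain gradient descent on the squared residual from its own random starting value. All names, the learning rate, and the step count are hypothetical choices.

```python
import random


def residual(u):
    """Toy residual u - u^3; its roots are u = -1, 0, and 1."""
    return u - u**3


def train_member(u0, lr=0.05, steps=2000):
    """Minimize residual(u)^2 by gradient descent, starting from u0."""
    u = u0
    for _ in range(steps):
        # d/du (u - u^3)^2 = 2 (u - u^3) (1 - 3 u^2)
        grad = 2.0 * residual(u) * (1.0 - 3.0 * u**2)
        u -= lr * grad
    return u


rng = random.Random(0)  # fixed seed so the run is repeatable
initial_guesses = [rng.uniform(-1.5, 1.5) for _ in range(20)]
members = [train_member(u0) for u0 in initial_guesses]

# Adding 0.0 normalizes a possible -0.0 before collecting distinct values.
distinct_solutions = sorted({round(u, 6) + 0.0 for u in members})
print(distinct_solutions)
```

Each member converges to whichever root its starting value leads to, so the ensemble as a whole can recover several solutions of the same equation, which is the mechanism the abstract attributes to random initialization combined with deep ensembles.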

LG · Apr 12, 2024

Leveraging viscous Hamilton-Jacobi PDEs for uncertainty quantification in scientific machine learning

Zongren Zou, Tingwei Meng, Paula Chen et al.

Uncertainty quantification (UQ) in scientific machine learning (SciML) combines the strong predictive power of SciML with methods for quantifying the reliability of the learned models. However, two major challenges remain: limited interpretability and expensive training procedures. We provide a new interpretation for UQ problems by establishing a new theoretical connection between some Bayesian inference problems arising in SciML and viscous Hamilton-Jacobi partial differential equations (HJ PDEs). Namely, we show that the posterior mean and covariance can be recovered from the spatial gradient and Hessian of the solution to a viscous HJ PDE. As a first exploration of this connection, we specialize to Bayesian inference problems with linear models, Gaussian likelihoods, and Gaussian priors. In this case, the associated viscous HJ PDEs can be solved using Riccati ODEs, and we develop a new Riccati-based methodology that provides computational advantages when continuously updating the model predictions. Specifically, our Riccati-based approach can efficiently add or remove data points to the training set invariant to the order of the data and continuously tune hyperparameters. Moreover, neither update requires retraining on or access to previously incorporated data. We provide several examples from SciML involving noisy data and epistemic uncertainty to illustrate the potential advantages of our approach. In particular, this approach's amenability to data streaming applications demonstrates its potential for real-time inferences, which, in turn, allows for applications in which the predicted uncertainty is used to dynamically alter the learning process.
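The linear-Gaussian setting named in this abstract has a standard closed-form posterior, and the order invariance of adding and removing data can be checked directly. The sketch below uses the textbook conjugate update in information (precision) form; it is not the paper's Riccati-based method, and the class name, feature map, and data values are hypothetical.

```python
import numpy as np


class GaussianLinearPosterior:
    """Posterior over w in y = phi^T w + noise, stored in information form."""

    def __init__(self, prior_mean, prior_cov, noise_var):
        self.noise_var = noise_var
        self.precision = np.linalg.inv(prior_cov)   # inverse covariance
        self.shift = self.precision @ prior_mean    # precision times mean

    def add(self, phi, y):
        """Incorporate one observation (phi, y) with a rank-one update."""
        self.precision += np.outer(phi, phi) / self.noise_var
        self.shift += phi * y / self.noise_var

    def remove(self, phi, y):
        """Undo a previously added observation; no other data is needed."""
        self.precision -= np.outer(phi, phi) / self.noise_var
        self.shift -= phi * y / self.noise_var

    @property
    def mean(self):
        return np.linalg.solve(self.precision, self.shift)

    @property
    def cov(self):
        return np.linalg.inv(self.precision)


def features(x):
    """Hypothetical feature map for a straight-line model."""
    return np.array([1.0, x])


def fit(points):
    post = GaussianLinearPosterior(np.zeros(2), np.eye(2), noise_var=0.1)
    for x, y in points:
        post.add(features(x), y)
    return post


data = [(0.0, 1.1), (1.0, 2.9), (2.0, 5.2)]  # made-up observations
forward = fit(data)
backward = fit(data[::-1])          # same data, reversed order

downdated = fit(data)
downdated.remove(features(2.0), 5.2)  # drop the last point after the fact
two_points = fit(data[:2])            # never saw the last point

print(forward.mean, backward.mean)
print(downdated.mean, two_points.mean)
```

Because each observation only adds a rank-one term to the precision and a vector to the shift, the result does not depend on the order of the data, and removal needs only the point being removed.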

LG · May 20, 2024

Fast meta-solvers for 3D complex-shape scatterers using neural operators trained on a non-scattering problem

Youngkyu Lee, Shanqing Liu, Zongren Zou et al.

Three-dimensional target identification using scattering techniques requires high accuracy solutions and very fast computations for real-time predictions in some critical applications. We first train a deep neural operator (DeepONet) to solve wave propagation problems described by the Helmholtz equation in a domain without scatterers but at different wavenumbers and with a complex absorbing boundary condition. We then design two classes of fast meta-solvers by combining DeepONet with either relaxation methods, such as Jacobi and Gauss-Seidel, or with Krylov methods, such as GMRES and BiCGStab, using the trunk basis of DeepONet as a coarse-scale preconditioner. We leverage the spectral bias of neural networks to account for the lower part of the spectrum in the error distribution while the upper part is handled inexpensively using relaxation methods or fine-scale preconditioners. The meta-solvers are then applied to solve scattering problems with different shapes of scatterers, at no extra training cost. We first demonstrate that the resulting meta-solvers are shape-agnostic, fast, and robust, whereas the standard standalone solvers may even fail to converge without the DeepONet. We then apply both classes of meta-solvers to scattering from a submarine, a complex three-dimensional problem. We achieve very fast solutions, especially with the DeepONet-Krylov methods, which require orders of magnitude fewer iterations than any of the standalone solvers.

CEDec 15, 2025

Probabilistic Predictions of Process-Induced Deformation in Carbon/Epoxy Composites Using a Deep Operator Network

Elham Kiyani, Amit Makarand Deshpande, Madhura Limaye et al.

Fiber reinforcement and polymer matrix respond differently to manufacturing conditions due to mismatch in coefficient of thermal expansion and matrix shrinkage during curing of thermosets. These heterogeneities generate residual stresses over multiple length scales, whose partial release leads to process-induced deformation (PID), requiring accurate prediction and mitigation via optimized non-isothermal cure cycles. This study considers a unidirectional AS4 carbon fiber/amine bi-functional epoxy prepreg and models PID using a two-mechanism framework that accounts for thermal expansion/shrinkage and cure shrinkage. The model is validated against manufacturing trials to identify initial and boundary conditions, then used to generate PID responses for a diverse set of non-isothermal cure cycles (time-temperature profiles). Building on this physics-based foundation, we develop a data-driven surrogate based on Deep Operator Networks (DeepONets). A DeepONet is trained on a dataset combining high-fidelity simulations with targeted experimental measurements of PID. We extend this to a Feature-wise Linear Modulation (FiLM) DeepONet, where branch-network features are modulated by external parameters, including the initial degree of cure, enabling prediction of time histories of degree of cure, viscosity, and deformation. Because experimental data are available only at limited time instances (for example, final deformation), we use transfer learning: simulation-trained trunk and branch networks are fixed and only the final layer is updated using measured final deformation. Finally, we augment the framework with Ensemble Kalman Inversion (EKI) to quantify uncertainty under experimental conditions and to support optimization of cure schedules for reduced PID in composites.
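A minimal forward-pass sketch may help clarify what modulating branch-network features with external parameters means. Everything below is illustrative: the layer sizes, the single hidden layer, the random weights, and the normalized inputs are assumptions, and the code does not reproduce the architecture or training of the study.

```python
import numpy as np

rng = np.random.default_rng(0)
m, c_dim, p = 20, 1, 16   # cure-cycle sensors, conditioning size, latent width (all assumed)

def init(in_dim, out_dim):
    """Random weight matrix with simple fan-in scaling."""
    return rng.standard_normal((in_dim, out_dim)) / np.sqrt(in_dim)

params = {
    "branch": init(m, p),
    "trunk": init(1, p),
    "gamma_w": init(c_dim, p), "gamma_b": np.ones(p),
    "beta_w": init(c_dim, p), "beta_b": np.zeros(p),
}

def film(features, cond, gamma_w, gamma_b, beta_w, beta_b):
    """Feature-wise linear modulation: scale and shift each feature using cond."""
    gamma = cond @ gamma_w + gamma_b
    beta = cond @ beta_w + beta_b
    return gamma * features + beta

def forward(params, cycle, cond, t):
    branch = np.tanh(cycle @ params["branch"])               # (batch, p)
    branch = film(branch, cond, params["gamma_w"], params["gamma_b"],
                  params["beta_w"], params["beta_b"])        # modulated branch features
    trunk = np.tanh(t @ params["trunk"])                     # (n_t, p)
    return branch @ trunk.T                                  # (batch, n_t)

# Hypothetical inputs: normalized temperature profiles and initial degree of cure.
cycles = rng.uniform(20.0, 180.0, size=(4, m)) / 180.0
cond = rng.uniform(0.0, 0.1, size=(4, c_dim))
t = np.linspace(0.0, 1.0, 50)[:, None]
pred = forward(params, cycles, cond, t)                      # one time history per cycle
```

With zero modulation weights, a unit scale bias, and a zero shift bias, `film` returns its input unchanged, so the conditioned network reduces to a plain DeepONet-style inner product of branch and trunk features.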

MLOct 7, 2025

Bilevel optimization for learning hyperparameters: Application to solving PDEs and inverse problems with Gaussian processes

Nicholas H. Nelsen, Houman Owhadi, Andrew M. Stuart et al.

Methods for solving scientific computing and inference problems, such as kernel- and neural network-based approaches for partial differential equations (PDEs), inverse problems, and supervised learning tasks, depend crucially on the choice of hyperparameters, which strongly affects their accuracy, stability, and generalization properties. While bilevel optimization offers a principled framework for hyperparameter tuning, its nested optimization structure can be computationally demanding, especially in PDE-constrained contexts. In this paper, we propose an efficient strategy for hyperparameter optimization within the bilevel framework by employing a Gauss-Newton linearization of the inner optimization step. Our approach provides closed-form updates, eliminating the need for repeated costly PDE solves. As a result, each iteration of the outer loop reduces to a single linearized PDE solve, followed by explicit gradient-based hyperparameter updates. We demonstrate the effectiveness of the proposed method through Gaussian process models applied to nonlinear PDEs and to PDE inverse problems. Extensive numerical experiments highlight substantial improvements in accuracy and robustness compared to conventional random hyperparameter initialization. In particular, experiments with additive kernels and neural network-parameterized deep kernels demonstrate the method's scalability and effectiveness for high-dimensional hyperparameter optimization.

LGJun 5, 2024

A comprehensive and FAIR comparison between MLP and KAN representations for differential equations and operator networks

Khemraj Shukla, Juan Diego Toscano, Zhicheng Wang et al.

Kolmogorov-Arnold Networks (KANs) were recently introduced as an alternative representation model to the multilayer perceptron (MLP). Herein, we employ KANs to construct physics-informed machine learning models (PIKANs) and deep operator models (DeepOKANs) for solving differential equations for forward and inverse problems. In particular, we compare them with physics-informed neural networks (PINNs) and deep operator networks (DeepONets), which are based on the standard MLP representation. We find that although the original KANs based on the B-spline parameterization lack accuracy and efficiency, modified versions based on low-order orthogonal polynomials have comparable performance to PINNs and DeepONets. However, they still lack robustness, as they may diverge for different random seeds or higher-order orthogonal polynomials. We visualize their corresponding loss landscapes and analyze their learning dynamics using information bottleneck theory. Our study follows the FAIR principles so that other researchers can use our benchmarks to further advance this emerging topic.
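To make the idea of a KAN layer built on low-order orthogonal polynomials concrete, here is a minimal sketch. The choice of Chebyshev polynomials, the `tanh` squashing of inputs into [-1, 1], and the function names are assumptions for illustration; the abstract does not specify which polynomial family or normalization the study uses.

```python
import numpy as np

def chebyshev_features(x, degree):
    """Evaluate T_0..T_degree at tanh(x); output has shape x.shape + (degree + 1,)."""
    z = np.tanh(x)                          # squash inputs into [-1, 1]
    T = [np.ones_like(z), z]
    for _ in range(2, degree + 1):
        T.append(2.0 * z * T[-1] - T[-2])   # three-term recurrence T_{d+1} = 2 z T_d - T_{d-1}
    return np.stack(T[: degree + 1], axis=-1)

def kan_layer(x, coeffs):
    """One KAN-style layer: y_j = sum_i phi_ij(x_i), phi_ij a Chebyshev expansion.

    x: (batch, in_dim); coeffs: (in_dim, out_dim, degree + 1), the learnable weights.
    """
    degree = coeffs.shape[-1] - 1
    feats = chebyshev_features(x, degree)            # (batch, in_dim, degree + 1)
    return np.einsum("bid,iod->bo", feats, coeffs)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 3))
coeffs = rng.standard_normal((3, 5, 4)) / 4.0        # degree-3 expansion on every edge
y = kan_layer(x, coeffs)                             # shape (8, 5)
```

Unlike an MLP layer, which applies a fixed activation after a learned linear map, each input-output edge here carries its own learned univariate function, and the layer output is the sum of those functions.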

LGMay 4, 2023

A Generative Modeling Framework for Inferring Families of Biomechanical Constitutive Laws in Data-Sparse Regimes

Minglang Yin, Zongren Zou, Enrui Zhang et al.

Quantifying biomechanical properties of the human vasculature could deepen our understanding of cardiovascular diseases. Standard nonlinear regression in constitutive modeling requires considerable high-quality data and an explicit form of the constitutive model as prior knowledge. By contrast, we propose a novel approach that combines generative deep learning with Bayesian inference to efficiently infer families of constitutive relationships in data-sparse regimes. Inspired by the concept of functional priors, we develop a generative adversarial network (GAN) that incorporates a neural operator as the generator and a fully-connected neural network as the discriminator. The generator takes a vector of noise conditioned on measurement data as input and yields the predicted constitutive relationship, which is then scrutinized by the discriminator. We demonstrate that this framework can accurately estimate means and standard deviations of the constitutive relationships of the murine aorta using either model-generated synthetic data or data from ex vivo experiments on mice with genetic deficiencies. In addition, the framework learns priors of constitutive models without explicitly knowing their functional form, providing a new model-agnostic approach to learning hidden constitutive behaviors from data.