LGAug 25, 2022Code
NeuralUQ: A comprehensive library for uncertainty quantification in neural differential equations and operatorsZongren Zou, Xuhui Meng, Apostolos F Psaros et al.
Uncertainty quantification (UQ) in machine learning is currently drawing increasing research interest, driven by the rapid deployment of deep neural networks across different fields, such as computer vision, natural language processing, and the need for reliable tools in risk-sensitive applications. Recently, various machine learning models have also been developed to tackle problems in the field of scientific computing with applications to computational science and engineering (CSE). Physics-informed neural networks and deep operator networks are two such models for solving partial differential equations and learning operator mappings, respectively. In this regard, a comprehensive study of UQ methods tailored specifically for scientific machine learning (SciML) models has been provided in [45]. Nevertheless, and despite their theoretical merit, implementations of these methods are not straightforward, especially in large-scale CSE applications, hindering their broad adoption in both research and industry settings. In this paper, we present an open-source Python library (https://github.com/Crunch-UQ4MI), termed NeuralUQ and accompanied by an educational tutorial, for employing UQ methods for SciML in a convenient and structured manner. The library, designed for both educational and research purposes, supports multiple modern UQ methods and SciML models. It is based on a succinct workflow and facilitates flexible employment and easy extensions by the users. We first present a tutorial of NeuralUQ and subsequently demonstrate its applicability and efficiency in four diverse examples, involving dynamical systems and high-dimensional parametric and time-dependent PDEs.
LGJan 5, 2023Code
L-HYDRA: Multi-Head Physics-Informed Neural NetworksZongren Zou, George Em Karniadakis
We introduce multi-head neural networks (MH-NNs) to physics-informed machine learning, which is a type of neural networks (NNs) with all nonlinear hidden layers as the body and multiple linear output layers as multi-head. Hence, we construct multi-head physics-informed neural networks (MH-PINNs) as a potent tool for multi-task learning (MTL), generative modeling, and few-shot learning for diverse problems in scientific machine learning (SciML). MH-PINNs connect multiple functions/tasks via a shared body as the basis functions as well as a shared distribution for the head. The former is accomplished by solving multiple tasks with MH-PINNs with each head independently corresponding to each task, while the latter by employing normalizing flows (NFs) for density estimate and generative modeling. To this end, our method is a two-stage method, and both stages can be tackled with standard deep learning tools of NNs, enabling easy implementation in practice. MH-PINNs can be used for various purposes, such as approximating stochastic processes, solving multiple tasks synergistically, providing informative prior knowledge for downstream few-shot learning tasks such as meta-learning and transfer learning, learning representative basis functions, and uncertainty quantification. We demonstrate the effectiveness of MH-PINNs in five benchmarks, investigating also the possibility of synergistic learning in regression analysis. We name the open-source code "Lernaean Hydra" (L-HYDRA), since this mythical creature possessed many heads for performing important multiple tasks, as in the proposed method.
LGSep 5, 2024Code
State-space models are accurate and efficient neural operators for dynamical systemsZheyuan Hu, Nazanin Ahmadi Daryakenari, Qianli Shen et al.
Physics-informed machine learning (PIML) has emerged as a promising alternative to classical methods for predicting dynamical systems, offering faster and more generalizable solutions. However, existing models, including recurrent neural networks (RNNs), transformers, and neural operators, face challenges such as long-time integration, long-range dependencies, chaotic dynamics, and extrapolation, to name a few. To this end, this paper introduces state-space models implemented in Mamba for accurate and efficient dynamical system operator learning. Mamba addresses the limitations of existing architectures by dynamically capturing long-range dependencies and enhancing computational efficiency through reparameterization techniques. To extensively test Mamba and compare against another 11 baselines, we introduce several strict extrapolation testbeds that go beyond the standard interpolation benchmarks. We demonstrate Mamba's superior performance in both interpolation and challenging extrapolation tasks. Mamba consistently ranks among the top models while maintaining the lowest computational cost and exceptional extrapolation capabilities. Moreover, we demonstrate the good performance of Mamba for a real-world application in quantitative systems pharmacology for assessing the efficacy of drugs in tumor growth under limited data scenarios. Taken together, our findings highlight Mamba's potential as a powerful tool for advancing scientific machine learning in dynamical systems modeling. (The code will be available at https://github.com/zheyuanhu01/State_Space_Model_Neural_Operator upon acceptance.)
LGJul 23, 2023
Tackling the Curse of Dimensionality with Physics-Informed Neural NetworksZheyuan Hu, Khemraj Shukla, George Em Karniadakis et al.
The curse-of-dimensionality taxes computational resources heavily with exponentially increasing computational cost as the dimension increases. This poses great challenges in solving high-dimensional PDEs, as Richard E. Bellman first pointed out over 60 years ago. While there has been some recent success in solving numerically partial differential equations (PDEs) in high dimensions, such computations are prohibitively expensive, and true scaling of general nonlinear PDEs to high dimensions has never been achieved. We develop a new method of scaling up physics-informed neural networks (PINNs) to solve arbitrary high-dimensional PDEs. The new method, called Stochastic Dimension Gradient Descent (SDGD), decomposes a gradient of PDEs into pieces corresponding to different dimensions and randomly samples a subset of these dimensional pieces in each iteration of training PINNs. We prove theoretically the convergence and other desired properties of the proposed method. We demonstrate in various diverse tests that the proposed method can solve many notoriously hard high-dimensional PDEs, including the Hamilton-Jacobi-Bellman (HJB) and the Schrödinger equations in tens of thousands of dimensions very fast on a single GPU using the PINNs mesh-free approach. Notably, we solve nonlinear PDEs with nontrivial, anisotropic, and inseparable solutions in 100,000 effective dimensions in 12 hours on a single GPU using SDGD with PINNs. Since SDGD is a general training methodology of PINNs, it can be applied to any current and future variants of PINNs to scale them up for arbitrary high-dimensional PDEs.
LGJul 8, 2022
Physics-Informed Deep Neural Operator NetworksSomdatta Goswami, Aniruddha Bora, Yue Yu et al.
Standard neural networks can approximate general nonlinear operators, represented either explicitly by a combination of mathematical operators, e.g., in an advection-diffusion-reaction partial differential equation, or simply as a black box, e.g., a system-of-systems. The first neural operator was the Deep Operator Network (DeepONet), proposed in 2019 based on rigorous approximation theory. Since then, a few other less general operators have been published, e.g., based on graph neural networks or Fourier transforms. For black box systems, training of neural operators is data-driven only but if the governing equations are known they can be incorporated into the loss function during training to develop physics-informed neural operators. Neural operators can be used as surrogates in design problems, uncertainty quantification, autonomous systems, and almost in any application requiring real-time inference. Moreover, independently pre-trained DeepONets can be used as components of a complex multi-physics system by coupling them together with relatively light training. Here, we present a review of DeepONet, the Fourier neural operator, and the graph neural operator, as well as appropriate extensions with feature expansions, and highlight their usefulness in diverse applications in computational mechanics, including porous media, fluid mechanics, and solid mechanics.
MTRL-SCIApr 11, 2022
Learning two-phase microstructure evolution using neural operators and autoencoder architecturesVivek Oommen, Khemraj Shukla, Somdatta Goswami et al.
Phase-field modeling is an effective but computationally expensive method for capturing the mesoscale morphological and microstructure evolution in materials. Hence, fast and generalizable surrogate models are needed to alleviate the cost of computationally taxing processes such as in optimization and design of materials. The intrinsic discontinuous nature of the physical phenomena incurred by the presence of sharp phase boundaries makes the training of the surrogate model cumbersome. We develop a framework that integrates a convolutional autoencoder architecture with a deep neural operator (DeepONet) to learn the dynamic evolution of a two-phase mixture and accelerate time-to-solution in predicting the microstructure evolution. We utilize the convolutional autoencoder to provide a compact representation of the microstructure data in a low-dimensional latent space. DeepONet, which consists of two sub-networks, one for encoding the input function at a fixed number of sensors locations (branch net) and another for encoding the locations for the output functions (trunk net), learns the mesoscale dynamics of the microstructure evolution from the autoencoder latent space. The decoder part of the convolutional autoencoder then reconstructs the time-evolved microstructure from the DeepONet predictions. The trained DeepONet architecture can then be used to replace the high-fidelity phase-field numerical solver in interpolation tasks or to accelerate the numerical solver in extrapolation tasks.
LGMay 12, 2022
Bayesian Physics-Informed Neural Networks for real-world nonlinear dynamical systemsKevin Linka, Amelie Schafer, Xuhui Meng et al.
Understanding real-world dynamical phenomena remains a challenging task. Across various scientific disciplines, machine learning has advanced as the go-to technology to analyze nonlinear dynamical systems, identify patterns in big data, and make decision around them. Neural networks are now consistently used as universal function approximators for data with underlying mechanisms that are incompletely understood or exceedingly complex. However, neural networks alone ignore the fundamental laws of physics and often fail to make plausible predictions. Here we integrate data, physics, and uncertainties by combining neural networks, physics-informed modeling, and Bayesian inference to improve the predictive potential of traditional neural network models. We embed the physical model of a damped harmonic oscillator into a fully-connected feed-forward neural network to explore a simple and illustrative model system, the outbreak dynamics of COVID-19. Our Physics-Informed Neural Networks can seamlessly integrate data and physics, robustly solve forward and inverse problems, and perform well for both interpolation and extrapolation, even for a small amount of noisy and incomplete data. At only minor additional cost, they can self-adaptively learn the weighting between data and physics. Combined with Bayesian Neural Networks, they can serve as priors in a Bayesian Inference, and provide credible intervals for uncertainty quantification. Our study reveals the inherent advantages and disadvantages of Neural Networks, Bayesian Inference, and a combination of both and provides valuable guidelines for model selection. While we have only demonstrated these approaches for the simple model problem of a seasonal endemic infectious disease, we anticipate that the underlying concepts and trends generalize to more complex disease conditions and, more broadly, to a wide variety of nonlinear dynamical systems.
LGApr 20, 2022
Deep transfer operator learning for partial differential equations under conditional shiftSomdatta Goswami, Katiana Kontolati, Michael D. Shields et al.
Transfer learning (TL) enables the transfer of knowledge gained in learning to perform one task (source) to a related but different task (target), hence addressing the expense of data acquisition and labeling, potential computational power limitations, and dataset distribution mismatches. We propose a new TL framework for task-specific learning (functional regression in partial differential equations (PDEs)) under conditional shift based on the deep operator network (DeepONet). Task-specific operator learning is accomplished by fine-tuning task-specific layers of the target DeepONet using a hybrid loss function that allows for the matching of individual target samples while also preserving the global properties of the conditional distribution of target data. Inspired by the conditional embedding operator theory, we minimize the statistical distance between labeled target data and the surrogate prediction on unlabeled target data by embedding conditional distributions onto a reproducing kernel Hilbert space. We demonstrate the advantages of our approach for various TL scenarios involving nonlinear PDEs under diverse conditions due to shift in the geometric domain and model dynamics. Our TL framework enables fast and efficient learning of heterogeneous tasks despite significant differences between the source and target domains.
NAJan 4, 2017
Second-order numerical methods for multi-term fractional differential equations: Smooth and non-smooth solutionsFanhai Zeng, Zhongqiang Zhang, George Em Karniadakis
Starting with the asymptotic expansion of the error equation of the shifted Grünwald--Letnikov formula, we derive a new modified weighted shifted Grünwald--Letnikov (WSGL) formula by introducing appropriate correction terms. We then apply one special case of the modified WSGL formula to solve multi-term fractional ordinary and partial differential equations, and we prove the linear stability and second-order convergence for both smooth and non-smooth solutions. We show theoretically and numerically that numerical solutions up to certain accuracy can be obtained with only a few correction terms. Moreover, the correction terms can be tuned according to the fractional derivative orders without explicitly knowing the analytical solutions. Numerical simulations verify the theoretical results and demonstrate that the new formula leads to better performance compared to other known numerical approximations with similar resolution.
LGDec 13, 2022
Reliable extrapolation of deep neural operators informed by physics or sparse observationsMin Zhu, Handi Zhang, Anran Jiao et al.
Deep neural operators can learn nonlinear mappings between infinite-dimensional function spaces via deep neural networks. As promising surrogate solvers of partial differential equations (PDEs) for real-time prediction, deep neural operators such as deep operator networks (DeepONets) provide a new simulation paradigm in science and engineering. Pure data-driven neural operators and deep learning models, in general, are usually limited to interpolation scenarios, where new predictions utilize inputs within the support of the training set. However, in the inference stage of real-world applications, the input may lie outside the support, i.e., extrapolation is required, which may result to large errors and unavoidable failure of deep learning models. Here, we address this challenge of extrapolation for deep neural operators. First, we systematically investigate the extrapolation behavior of DeepONets by quantifying the extrapolation complexity via the 2-Wasserstein distance between two function spaces and propose a new behavior of bias-variance trade-off for extrapolation with respect to model capacity. Subsequently, we develop a complete workflow, including extrapolation determination, and we propose five reliable learning methods that guarantee a safe prediction under extrapolation by requiring additional information -- the governing PDEs of the system or sparse new observations. The proposed methods are based on either fine-tuning a pre-trained DeepONet or multifidelity learning. We demonstrate the effectiveness of the proposed framework for various types of parametric PDEs. Our systematic comparisons provide practical guidelines for selecting a proper extrapolation method depending on the available information, desired accuracy, and required inference speed.
LGNov 16, 2022
Augmented Physics-Informed Neural Networks (APINNs): A gating network-based soft domain decomposition methodologyZheyuan Hu, Ameya D. Jagtap, George Em Karniadakis et al.
In this paper, we propose the augmented physics-informed neural network (APINN), which adopts soft and trainable domain decomposition and flexible parameter sharing to further improve the extended PINN (XPINN) as well as the vanilla PINN methods. In particular, a trainable gate network is employed to mimic the hard decomposition of XPINN, which can be flexibly fine-tuned for discovering a potentially better partition. It weight-averages several sub-nets as the output of APINN. APINN does not require complex interface conditions, and its sub-nets can take advantage of all training samples rather than just part of the training data in their subdomains. Lastly, each sub-net shares part of the common parameters to capture the similar components in each decomposed function. Furthermore, following the PINN generalization theory in Hu et al. [2021], we show that APINN can improve generalization by proper gate network initialization and general domain & function decomposition. Extensive experiments on different types of PDEs demonstrate how APINN improves the PINN and XPINN methods. Specifically, we present examples where XPINN performs similarly to or worse than PINN, so that APINN can significantly improve both. We also show cases where XPINN is already better than PINN, so APINN can still slightly improve XPINN. Furthermore, we visualize the optimized gating networks and their optimization trajectories, and connect them with their performance, which helps discover the possibly optimal decomposition. Interestingly, if initialized by different decomposition, the performances of corresponding APINNs can differ drastically. This, in turn, shows the potential to design an optimal domain decomposition for the differential equation problem under consideration.
LGSep 6, 2022
How important are activation functions in regression and classification? A survey, performance comparison, and future directionsAmeya D. Jagtap, George Em Karniadakis
Inspired by biological neurons, the activation functions play an essential part in the learning process of any artificial neural network commonly used in many real-world problems. Various activation functions have been proposed in the literature for classification as well as regression tasks. In this work, we survey the activation functions that have been employed in the past as well as the current state-of-the-art. In particular, we present various developments in activation functions over the years and the advantages as well as disadvantages or limitations of these activation functions. We also discuss classical (fixed) activation functions, including rectifier units, and adaptive activation functions. In addition to discussing the taxonomy of activation functions based on characterization, a taxonomy of activation functions based on applications is presented. To this end, the systematic comparison of various fixed and adaptive activation functions is performed for classification data sets such as the MNIST, CIFAR-10, and CIFAR- 100. In recent years, a physics-informed machine learning framework has emerged for solving problems related to scientific computations. For this purpose, we also discuss various requirements for activation functions that have been used in the physics-informed machine learning framework. Furthermore, various comparisons are made among different fixed and adaptive activation functions using various machine learning libraries such as TensorFlow, Pytorch, and JAX.
COMP-PHFeb 28, 2023
A unified scalable framework for causal sweeping strategies for Physics-Informed Neural Networks (PINNs) and their temporal decompositionsMichael Penwarden, Ameya D. Jagtap, Shandian Zhe et al.
Physics-informed neural networks (PINNs) as a means of solving partial differential equations (PDE) have garnered much attention in the Computational Science and Engineering (CS&E) world. However, a recent topic of interest is exploring various training (i.e., optimization) challenges - in particular, arriving at poor local minima in the optimization landscape results in a PINN approximation giving an inferior, and sometimes trivial, solution when solving forward time-dependent PDEs with no data. This problem is also found in, and in some sense more difficult, with domain decomposition strategies such as temporal decomposition using XPINNs. We furnish examples and explanations for different training challenges, their cause, and how they relate to information propagation and temporal decomposition. We then propose a new stacked-decomposition method that bridges the gap between time-marching PINNs and XPINNs. We also introduce significant computational speed-ups by using transfer learning concepts to initialize subnetworks in the domain and loss tolerance-based propagation for the subdomains. Finally, we formulate a new time-sweeping collocation point algorithm inspired by the previous PINNs causality literature, which our framework can still describe, and provides a significant computational speed-up via reduced-cost collocation point segmentation. The proposed methods form our unified framework, which overcomes training challenges in PINNs and XPINNs for time-dependent PDEs by respecting the causality in multiple forms and improving scalability by limiting the computation required per optimization iteration. Finally, we provide numerical results for these methods on baseline PDE problems for which unmodified PINNs and XPINNs struggle to train.
NAApr 28, 2016
Petrov-Galerkin and Spectral Collocation Methods for distributed Order Differential EquationsEhsan Kharazmi, Mohsen Zayernouri, George Em Karniadakis
Distributed order fractional operators offer a rigorous tool for mathematical modelling of multi-physics phenomena, where the differential orders are distributed over a range of values rather than being just a fixed integer/fraction as it is in standard/fractional ODEs/PDEs. We develop two spectrally-accurate schemes, namely a Petrov-Galerkin spectral method and a spectral collocation method for distributed order fractional differential equations. These schemes are developed based on the fractional Sturm-Liouville eigen-problems (FSLPs). In the Petrov-Galerkin method, we employ fractional (non-polynomial) basis functions, called \textit{Jacobi poly-fractonomials}, which are the eigenfunctions of the FSLP of first kind, while, we employ another space of test functions as the span of poly-fractonomial eigenfunctions of the FSLP of second kind. We define the underlying \textit{distributed Sobolev space} and the associated norms, where we carry out the corresponding discrete stability and error analyses of the proposed scheme. In the collocation scheme, we employ fractional (non-polynomial) Lagrange interpolants satisfying the Kronecker delta property at the collocation points. Subsequently, we obtain the corresponding distributed differentiation matrices to be employed in the discretization of the strong problem. We perform systematic numerical tests to demonstrate the efficiency and conditioning of each method.
NAApr 3, 2017
Adaptive Finite Element Method for fractional differential equations using Hierarchical MatricesXuan Zhao, Xiaozhe Hu, Wei Cai et al.
A robust and fast solver for the fractional differential equation (FDEs) involving the Riesz fractional derivative is developed using an adaptive finite element method on non-uniform meshes. It is based on the utilization of hierarchical matrices ($\mathcal{H}$-Matrices) for the representation of the stiffness matrix resulting from the finite element discretization of the FDEs. We employ a geometric multigrid method for the solution of the algebraic system of equations. We combine it with an adaptive algorithm based on a posteriori error estimation to deal with general-type singularities arising in the solution of the FDEs. Through various test examples we demonstrate the efficiency of the method and the high-accuracy of the numerical solution even in the presence of singularities. The proposed technique has been verified effectively through fundamental examples including Riesz, Left/Right Riemann-Liouville fractional derivative and, furthermore, it can be readily extended to more general fractional differential equations with different boundary conditions and low-order terms. To the best of our knowledge, there are currently no other methods for FDEs that resolve singularities accurately at linear complexity as the one we propose here.
LGApr 5, 2022
Discovering and forecasting extreme events via active learning in neural operatorsEthan Pickering, Stephen Guth, George Em Karniadakis et al.
Extreme events in society and nature, such as pandemic spikes, rogue waves, or structural failures, can have catastrophic consequences. Characterizing extremes is difficult as they occur rarely, arise from seemingly benign conditions, and belong to complex and often unknown infinite-dimensional systems. Such challenges render attempts at characterizing them as moot. We address each of these difficulties by combining novel training schemes in Bayesian experimental design (BED) with an ensemble of deep neural operators (DNOs). This model-agnostic framework pairs a BED scheme that actively selects data for quantifying extreme events with an ensemble of DNOs that approximate infinite-dimensional nonlinear operators. We find that not only does this framework clearly beat Gaussian processes (GPs) but that 1) shallow ensembles of just two members perform best; 2) extremes are uncovered regardless of the state of initial data (i.e. with or without extremes); 3) our method eliminates "double-descent" phenomena; 4) the use of batches of suboptimal acquisition points compared to step-by-step global optima does not hinder BED performance; and 5) Monte Carlo acquisition outperforms standard optimizers in high-dimensions. Together these conclusions form the foundation of an AI-assisted experimental infrastructure that can efficiently infer and pinpoint critical situations across many domains, from physical to societal systems.
LGOct 16, 2023
Correcting model misspecification in physics-informed neural networks (PINNs)Zongren Zou, Xuhui Meng, George Em Karniadakis
Data-driven discovery of governing equations in computational science has emerged as a new paradigm for obtaining accurate physical models and as a possible alternative to theoretical derivations. The recently developed physics-informed neural networks (PINNs) have also been employed to learn governing equations given data across diverse scientific disciplines. Despite the effectiveness of PINNs for discovering governing equations, the physical models encoded in PINNs may be misspecified in complex systems as some of the physical processes may not be fully understood, leading to the poor accuracy of PINN predictions. In this work, we present a general approach to correct the misspecified physical models in PINNs for discovering governing equations, given some sparse and/or noisy data. Specifically, we first encode the assumed physical models, which may be misspecified, then employ other deep neural networks (DNNs) to model the discrepancy between the imperfect models and the observational data. Due to the expressivity of DNNs, the proposed method is capable of reducing the computational errors caused by the model misspecification and thus enables the applications of PINNs in complex systems where the physical processes are not exactly known. Furthermore, we utilize the Bayesian PINNs (B-PINNs) and/or ensemble PINNs to quantify uncertainties arising from noisy and/or gappy data in the discovered governing equations. A series of numerical examples including non-Newtonian channel and cavity flows demonstrate that the added DNNs are capable of correcting the model misspecification in PINNs and thus reduce the discrepancy between the physical models and the observational data. We envision that the proposed approach will extend the applications of PINNs for discovering governing equations in problems where the physico-chemical or biological processes are not well understood.
CHEM-PHFeb 23, 2023
Learning stiff chemical kinetics using extended deep neural operatorsSomdatta Goswami, Ameya D. Jagtap, Hessam Babaee et al.
We utilize neural operators to learn the solution propagator for the challenging chemical kinetics equation. Specifically, we apply the deep operator network (DeepONet) along with its extensions, such as the autoencoder-based DeepONet and the newly proposed Partition-of-Unity (PoU-) DeepONet to study a range of examples, including the ROBERS problem with three species, the POLLU problem with 25 species, pure kinetics of the syngas skeletal model for $CO/H_2$ burning, which contains 11 species and 21 reactions and finally, a temporally developing planar $CO/H_2$ jet flame (turbulent flame) using the same syngas mechanism. We have demonstrated the advantages of the proposed approach through these numerical examples. Specifically, to train the DeepONet for the syngas model, we solve the skeletal kinetic model for different initial conditions. In the first case, we parametrize the initial conditions based on equivalence ratios and initial temperature values. In the second case, we perform a direct numerical simulation of a two-dimensional temporally developing $CO/H_2$ jet flame. Then, we initialize the kinetic model by the thermochemical states visited by a subset of grid points at different time snapshots. Stiff problems are computationally expensive to solve with traditional stiff solvers. Thus, this work aims to develop a neural operator-based surrogate model to solve stiff chemical kinetics. The operator, once trained offline, can accurately integrate the thermochemical state for arbitrarily large time advancements, leading to significant computational gains compared to stiff integration schemes.
NAAug 28, 2022
Blending Neural Operators and Relaxation Methods in PDE Numerical SolversEnrui Zhang, Adar Kahana, Alena Kopaničáková et al.
Neural networks suffer from spectral bias having difficulty in representing the high frequency components of a function while relaxation methods can resolve high frequencies efficiently but stall at moderate to low frequencies. We exploit the weaknesses of the two approaches by combining them synergistically to develop a fast numerical solver of partial differential equations (PDEs) at scale. Specifically, we propose HINTS, a hybrid, iterative, numerical, and transferable solver by integrating a Deep Operator Network (DeepONet) with standard relaxation methods, leading to parallel efficiency and algorithmic scalability for a wide class of PDEs, not tractable with existing monolithic solvers. HINTS balances the convergence behavior across the spectrum of eigenmodes by utilizing the spectral bias of DeepONet, resulting in a uniform convergence rate and hence exceptional performance of the hybrid solver overall. Moreover, HINTS applies to large-scale, multidimensional systems, it is flexible with regards to discretizations, computational domain, and boundary conditions.
LGJul 16, 2023
Discovering a reaction-diffusion model for Alzheimer's disease by combining PINNs with symbolic regressionZhen Zhang, Zongren Zou, Ellen Kuhl et al.
Misfolded tau proteins play a critical role in the progression and pathology of Alzheimer's disease. Recent studies suggest that the spatio-temporal pattern of misfolded tau follows a reaction-diffusion type equation. However, the precise mathematical model and parameters that characterize the progression of misfolded protein across the brain remain incompletely understood. Here, we use deep learning and artificial intelligence to discover a mathematical model for the progression of Alzheimer's disease using longitudinal tau positron emission tomography from the Alzheimer's Disease Neuroimaging Initiative database. Specifically, we integrate physics informed neural networks (PINNs) and symbolic regression to discover a reaction-diffusion type partial differential equation for tau protein misfolding and spreading. First, we demonstrate the potential of our model and parameter discovery on synthetic data. Then, we apply our method to discover the best model and parameters to explain tau imaging data from 46 individuals who are likely to develop Alzheimer's disease and 30 healthy controls. Our symbolic regression discovers different misfolding models $f(c)$ for two groups, with a faster misfolding for the Alzheimer's group, $f(c) = 0.23c^3 - 1.34c^2 + 1.11c$, than for the healthy control group, $f(c) = -c^3 +0.62c^2 + 0.39c$. Our results suggest that PINNs, supplemented by symbolic regression, can discover a reaction-diffusion type model to explain misfolded tau protein concentrations in Alzheimer's disease. We expect our study to be the starting point for a more holistic analysis to provide image-based technologies for early diagnosis, and ideally early treatment of neurodegeneration in Alzheimer's disease and possibly other misfolding-protein based neurodegenerative disorders.
NADec 8, 2018
Efficient multistep methods for tempered fractional calculus: Algorithms and SimulationsLing Guo, Fanhai Zeng, Ian Turner et al.
In this work, we extend the fractional linear multistep methods in [C. Lubich, SIAM J. Math. Anal., 17 (1986), pp.704--719] to the tempered fractional integral and derivative operators in the sense that the tempered fractional derivative operator is interpreted in terms of the Hadamard finite-part integral. We develop two fast methods, Fast Method I and Fast Method II, with linear complexity to calculate the discrete convolution for the approximation of the (tempered) fractional operator. Fast Method I is based on a local approximation for the contour integral that represents the convolution weight. Fast Method II is based on a globally uniform approximation of the trapezoidal rule for the integral on the real line. Both methods are efficient, but numerical experimentation reveals that Fast Method II outperforms Fast Method I in terms of accuracy, efficiency, and coding simplicity. The memory requirement and computational cost of Fast Method II are $O(Q)$ and $O(Qn_T)$, respectively, where $n_T$ is the number of the final time steps and $Q$ is the number of quadrature points used in the trapezoidal rule. The effectiveness of the fast methods is verified through a series of numerical examples for long-time integration, including a numerical study of a fractional reaction-diffusion model.
QMSep 29, 2023
AI-Aristotle: A Physics-Informed framework for Systems Biology Gray-Box IdentificationNazanin Ahmadi Daryakenari, Mario De Florio, Khemraj Shukla et al.
Discovering mathematical equations that govern physical and biological systems from observed data is a fundamental challenge in scientific research. We present a new physics-informed framework for parameter estimation and missing physics identification (gray-box) in the field of Systems Biology. The proposed framework -- named AI-Aristotle -- combines eXtreme Theory of Functional Connections (X-TFC) domain-decomposition and Physics-Informed Neural Networks (PINNs) with symbolic regression (SR) techniques for parameter discovery and gray-box identification. We test the accuracy, speed, flexibility and robustness of AI-Aristotle based on two benchmark problems in Systems Biology: a pharmacokinetics drug absorption model, and an ultradian endocrine model for glucose-insulin interactions. We compare the two machine learning methods (X-TFC and PINNs), and moreover, we employ two different symbolic regression techniques to cross-verify our results. While the current work focuses on the performance of AI-Aristotle based on synthetic data, it can equally handle noisy experimental data and can even be used for black-box identification in just a few minutes on a laptop. More broadly, our work provides insights into the accuracy, cost, scalability, and robustness of integrating neural networks with symbolic regressors, offering a comprehensive guide for researchers tackling gray-box identification challenges in complex dynamical systems in biomedicine and beyond.
LGMar 19, 2023
LNO: Laplace Neural Operator for Solving Differential EquationsQianying Cao, Somdatta Goswami, George Em Karniadakis
We introduce the Laplace neural operator (LNO), which leverages the Laplace transform to decompose the input space. Unlike the Fourier Neural Operator (FNO), LNO can handle non-periodic signals, account for transient responses, and exhibit exponential convergence. LNO incorporates the pole-residue relationship between the input and the output space, enabling greater interpretability and improved generalization ability. Herein, we demonstrate the superior approximation accuracy of a single Laplace layer in LNO over four Fourier modules in FNO in approximating the solutions of three ODEs (Duffing oscillator, driven gravity pendulum, and Lorenz system) and three PDEs (Euler-Bernoulli beam, diffusion equation, and reaction-diffusion system). Notably, LNO outperforms FNO in capturing transient responses in undamped scenarios. For the linear Euler-Bernoulli beam and diffusion equation, LNO's exact representation of the pole-residue formulation yields significantly better results than FNO. For the nonlinear reaction-diffusion system, LNO's errors are smaller than those of FNO, demonstrating the effectiveness of using system poles and residues as network parameters for operator learning. Overall, our results suggest that LNO represents a promising new approach for learning neural operators that map functions between infinite-dimensional spaces.
LGMay 16, 2022
Scalable algorithms for physics-informed neural and graph networksKhemraj Shukla, Mengjia Xu, Nathaniel Trask et al.
Physics-informed machine learning (PIML) has emerged as a promising new approach for simulating complex physical and biological systems that are governed by complex multiscale processes for which some data are also available. In some instances, the objective is to discover part of the hidden physics from the available data, and PIML has been shown to be particularly effective for such problems for which conventional methods may fail. Unlike commercial machine learning where training of deep neural networks requires big data, in PIML big data are not available. Instead, we can train such networks from additional information obtained by employing the physical laws and evaluating them at random points in the space-time domain. Such physics-informed machine learning integrates multimodality and multifidelity data with mathematical models, and implements them using neural networks or graph networks. Here, we review some of the prevailing trends in embedding physics into machine learning, using physics-informed neural networks (PINNs) based primarily on feed-forward neural networks and automatic differentiation. For more complex systems or systems of systems and unstructured data, graph neural networks (GNNs) present some distinct advantages, and here we review how physics-informed learning can be accomplished with GNNs based on graph exterior calculus to construct differential operators; we refer to these architectures as physics-informed graph networks (PIGNs). We present representative examples for both forward and inverse problems and discuss what advances are needed to scale up PINNs, PIGNs and more broadly GNNs for large-scale engineering problems.
LGMar 9, 2022
On the influence of over-parameterization in manifold based surrogates and deep neural operatorsKatiana Kontolati, Somdatta Goswami, Michael D. Shields et al.
Constructing accurate and generalizable approximators for complex physico-chemical processes exhibiting highly non-smooth dynamics is challenging. In this work, we propose new developments and perform comparisons for two promising approaches: manifold-based polynomial chaos expansion (m-PCE) and the deep neural operator (DeepONet), and we examine the effect of over-parameterization on generalization. We demonstrate the performance of these methods in terms of generalization accuracy by solving the 2D time-dependent Brusselator reaction-diffusion system with uncertainty sources, modeling an autocatalytic chemical reaction between two species. We first propose an extension of the m-PCE by constructing a mapping between latent spaces formed by two separate embeddings of input functions and output QoIs. To enhance the accuracy of the DeepONet, we introduce weight self-adaptivity in the loss function. We demonstrate that the performance of m-PCE and DeepONet is comparable for cases of relatively smooth input-output mappings. However, when highly non-smooth dynamics is considered, DeepONet shows higher accuracy. We also find that for m-PCE, modest over-parameterization leads to better generalization, both within and outside of distribution, whereas aggressive over-parameterization leads to over-fitting. In contrast, an even highly over-parameterized DeepONet leads to better generalization for both smooth and non-smooth dynamics. Furthermore, we compare the performance of the above models with another operator learning model, the Fourier Neural Operator, and show that its over-parameterization also leads to better generalization. Our studies show that m-PCE can provide very good accuracy at very low training cost, whereas a highly over-parameterized DeepONet can provide better accuracy and robustness to noise but at higher training cost. In both methods, the inference cost is negligible.
NAJun 20, 2018
Fractional Gray-Scott Model: Well-posedness, Discretization, and SimulationsTingting Wang, Fangying Song, Hong Wang et al.
The Gray-Scott (GS) model represents the dynamics and steady state pattern formation in reaction-diffusion systems and has been extensively studied in the past. In this paper, we consider the effects of anomalous diffusion on pattern formation by introducing the fractional Laplacian into the GS model. First, we prove that the continuous solutions of the fractional GS model are unique. We then introduce the Crank-Nicolson (C-N) scheme for time discretization and weighted shifted Grünwald difference operator for spatial discretization. We perform stability analysis for the time semi-discrete numerical scheme, and furthermore, we analyze numerically the errors with benchmark solutions that show second-order convergence both in time and space. We also employ the spectral collocation method in space and C-N scheme in time to solve the GS model in order to verify the accuracy of our numerical solutions. We observe the formation of different patterns at different values of the fractional order, which are quite different than the patterns of the corresponding integer-order GS model, and quantify them by using the radial distribution function (RDF). Finally, we discover the scaling law for steady patterns of the RDFs in terms of the fractional order $1<α\leq 2 $.
NAOct 27, 2016
A Petrov-Galerkin Spectral Element Method for Fractional Elliptic ProblemsEhsan Kharazmi, Mohsen Zayernouri, George Em Karniadakis
We develop a new $C^{\,0}$-continuous Petrov-Galerkin spectral element method for one-dimensional fractional elliptic problems of the form ${}_{0}{\mathcal{D}}_{x}^α u(x) - λu(x) = f(x)$, $α\in (1,2]$, subject to homogeneous boundary conditions. We employ the standard (modal) spectral element bases and the Jacobi poly-fractonomials as the test functions [1]. We formulate a new procedure for assembling the global linear system from elemental (local) mass and stiffness matrices. The Petrov-Galerkin formulation requires performing elemental (local) construction of mass and stiffness matrices in the standard domain only once. Moreover, we efficiently obtain the non-local (history) stiffness matrices, in which the non-locality is presented analytically for uniform grids. We also investigate two distinct choices of basis/test functions: i) local basis/test functions, and ii) local basis with global test functions. We show that the former choice leads to a better-conditioned system and accuracy. We consider smooth and singular solutions, where the singularity can occur at boundary points as well as in the interior domain. We also construct two non-uniform grids over the whole computational domain in order to capture singular solutions. Finally, we perform a systematic numerical study of non-local effects via full and partial history fading in order to further enhance the efficiency of the scheme.
NASep 29, 2017
A Riesz basis Galerkin method for the tempered fractional LaplacianZhijiang Zhang, Weihua Deng, George Em Karniadakis
The fractional Laplacian $Δ^{β/2}$ is the generator of $β$-stable Lévy process, which is the scaling limit of the Lévy fight. Due to the divergence of the second moment of the jump length of the Lévy fight it is not appropriate as a physical model in many practical applications. However, using a parameter $λ$ to exponentially temper the isotropic power law measure of the jump length leads to the tempered Lévy fight, which has finite second moment. For short time the tempered Lévy fight exhibits the dynamics of Lévy fight while after sufficiently long time it turns to normal diffusion. The generator of tempered $β$-stable Lévy process is the tempered fractional Laplacian $(Δ+λ)^{β/2}$ [W.H. Deng, B.Y. Li, W.Y. Tian, and P.W. Zhang, Multiscale Model. Simul., in press, 2017]. In the current work, we present new computational methods for the tempered fractional Laplacian equation, including the cases with the homogeneous and nonhomogeneous generalized Dirichlet type boundary conditions. We prove the well-posedness of the Galerkin weak formulation and provide convergence analysis of the single scaling B-spline and multiscale Riesz bases finite element methods. We propose a technique for efficiently generating the entries of the dense stiffness matrix and for solving the resulting algebraic equation by preconditioning. We also present several numerical experiments to verify the theoretical results.
LGNov 19, 2023
Uncertainty quantification for noisy inputs-outputs in physics-informed neural networks and neural operatorsZongren Zou, Xuhui Meng, George Em Karniadakis
Uncertainty quantification (UQ) in scientific machine learning (SciML) becomes increasingly critical as neural networks (NNs) are being widely adopted in addressing complex problems across various scientific disciplines. Representative SciML models are physics-informed neural networks (PINNs) and neural operators (NOs). While UQ in SciML has been increasingly investigated in recent years, very few works have focused on addressing the uncertainty caused by the noisy inputs, such as spatial-temporal coordinates in PINNs and input functions in NOs. The presence of noise in the inputs of the models can pose significantly more challenges compared to noise in the outputs of the models, primarily due to the inherent nonlinearity of most SciML algorithms. As a result, UQ for noisy inputs becomes a crucial factor for reliable and trustworthy deployment of these models in applications involving physical knowledge. To this end, we introduce a Bayesian approach to quantify uncertainty arising from noisy inputs-outputs in PINNs and NOs. We show that this approach can be seamlessly integrated into PINNs and NOs, when they are employed to encode the physical information. PINNs incorporate physics by including physics-informed terms via automatic differentiation, either in the loss function or the likelihood, and often take as input the spatial-temporal coordinate. Therefore, the present method equips PINNs with the capability to address problems where the observed coordinate is subject to noise. On the other hand, pretrained NOs are also commonly employed as equation-free surrogates in solving differential equations and Bayesian inverse problems, in which they take functions as inputs. The proposed approach enables them to handle noisy measurements for both input and output functions with UQ.
CVMar 15, 2023
ViTO: Vision Transformer-OperatorOded Ovadia, Adar Kahana, Panos Stinis et al.
We combine vision transformers with operator learning to solve diverse inverse problems described by partial differential equations (PDEs). Our approach, named ViTO, combines a U-Net based architecture with a vision transformer. We apply ViTO to solve inverse PDE problems of increasing complexity, namely for the wave equation, the Navier-Stokes equations and the Darcy equation. We focus on the more challenging case of super-resolution, where the input dataset for the inverse problem is at a significantly coarser resolution than the output. The results we obtain are comparable or exceed the leading operator network benchmarks in terms of accuracy. Furthermore, ViTO`s architecture has a small number of trainable parameters (less than 10% of the leading competitor), resulting in a performance speed-up of over 5x when averaged over the various test cases.
LGMar 3, 2023
Neural Operator Learning for Long-Time Integration in Dynamical Systems with Recurrent Neural NetworksKatarzyna Michałowska, Somdatta Goswami, George Em Karniadakis et al.
Deep neural networks are an attractive alternative for simulating complex dynamical systems, as in comparison to traditional scientific computing methods, they offer reduced computational costs during inference and can be trained directly from observational data. Existing methods, however, cannot extrapolate accurately and are prone to error accumulation in long-time integration. Herein, we address this issue by combining neural operators with recurrent neural networks, learning the operator mapping, while offering a recurrent structure to capture temporal dependencies. The integrated framework is shown to stabilize the solution and reduce error accumulation for both interpolation and extrapolation of the Korteweg-de Vries equation.
LGMay 8, 2022
Neural operator learning of heterogeneous mechanobiological insults contributing to aortic aneurysmsSomdatta Goswami, David S. Li, Bruno V. Rego et al.
Thoracic aortic aneurysm (TAA) is a localized dilatation of the aorta resulting from compromised wall composition, structure, and function, which can lead to life-threatening dissection or rupture. Several genetic mutations and predisposing factors that contribute to TAA have been studied in mouse models to characterize specific changes in aortic microstructure and material properties that result from a wide range of mechanobiological insults. Assessments of TAA progression in vivo is largely limited to measurements of aneurysm size and growth rate. It has been shown that aortic geometry alone is not sufficient to predict the patient-specific progression of TAA but computational modeling of the evolving biomechanics of the aorta could predict future geometry and properties from initiating insults. In this work, we present an integrated framework to train a deep operator network (DeepONet)-based surrogate model to identify contributing factors for TAA by using FE-based datasets of aortic growth and remodeling resulting from prescribed insults. For training data, we investigate multiple types of TAA risk factors and spatial distributions within a constrained mixture model to generate axial--azimuthal maps of aortic dilatation and distensibility. The trained network is then capable of predicting the initial distribution and extent of the insult from a given set of dilatation and distensibility information. Two DeepONet frameworks are proposed, one trained on sparse information and one on full-field grayscale images, to gain insight into a preferred neural operator-based approach. Performance of the surrogate models is evaluated through multiple simulations carried out on insult distributions varying from fusiform to complex. We show that the proposed approach can predict patient-specific mechanobiological insult profile with a high accuracy, particularly when based on full-field images.
SDAug 9, 2023
Sound propagation in realistic interactive 3D scenes with parameterized sources using deep neural operatorsNikolas Borrel-Jensen, Somdatta Goswami, Allan P. Engsig-Karup et al.
We address the challenge of sound propagation simulations in 3D virtual rooms with moving sources, which have applications in virtual/augmented reality, game audio, and spatial computing. Solutions to the wave equation can describe wave phenomena such as diffraction and interference. However, simulating them using conventional numerical discretization methods with hundreds of source and receiver positions is intractable, making stimulating a sound field with moving sources impractical. To overcome this limitation, we propose using deep operator networks to approximate linear wave-equation operators. This enables the rapid prediction of sound propagation in realistic 3D acoustic scenes with moving sources, achieving millisecond-scale computations. By learning a compact surrogate model, we avoid the offline calculation and storage of impulse responses for all relevant source/listener pairs. Our experiments, including various complex scene geometries, show good agreement with reference solutions, with root mean squared errors ranging from 0.02 Pa to 0.10 Pa. Notably, our method signifies a paradigm shift as no prior machine learning approach has achieved precise predictions of complete wave fields within realistic domains. We anticipate that our findings will drive further exploration of deep neural operator methods, advancing research in immersive user experiences within virtual environments.$
NAAug 11, 2018
A new class of semi-implicit methods with linear complexity for nonlinear fractional differential equationsFanhai Zeng, Ian Turner, Kevin Burrage et al.
We propose a new class of semi-implicit methods for solving nonlinear fractional differential equations and study their stability. Several versions of our new schemes are proved to be unconditionally stable by choosing suitable parameters. Subsequently, we develop an efficient strategy to calculate the discrete convolution for the approximation of the fractional operator in the semi-implicit method and we derive an error bound of the fast convolution. The memory requirement and computational cost of the present semi-implicit methods with a fast convolution are about $O(N\log n_T)$ and $O(Nn_T\log n_T)$, respectively, where $N$ is a suitable positive integer and $n_T$ is the final number of time steps. Numerical simulations, including the solution of a system of two nonlinear fractional diffusion equations with different fractional orders in two-dimensions, are presented to verify the effectiveness of the semi-implicit methods.
96.9AIJun 3
Agents' Last ExamYiyou Sun, Xinyang Han, Weichen Zhang et al.
Recent AI systems have achieved strong results on a wide range of benchmarks, yet these gains have not translated into economically meaningful deployment across many professional domains. We argue that this gap is largely an evaluation problem: widely used benchmarks lack sustained performance measurement on real and economically valuable workflows. This paper introduces Agents' Last Exam (ALE), a benchmark designed to evaluate AI agents on long-horizon, economically valuable, real-world tasks with verifiable outcomes. Developed in collaboration with 250+ industry experts, ALE covers non-physical industries defined with reference to O*NET / SOC 2018 (the U.S. federal occupational taxonomy). It is organized around a task taxonomy with 55 subfields grouped into 13 industry clusters covering 1K+ tasks. Current results show that the hardest tier remains far from saturated: across mainstream harness and backbone configurations, the average full pass rate is 2.6%. ALE is designed as a living benchmark: its task pool grows continuously as new workflows and industries are onboarded. More broadly, ALE is intended not merely as another leaderboard, but as an instrument for closing the gap between benchmark success and GDP-relevant impact.
LGAug 5, 2024
Synergistic Learning with Multi-Task DeepONet for Efficient PDE Problem SolvingVarun Kumar, Somdatta Goswami, Katiana Kontolati et al.
Multi-task learning (MTL) is an inductive transfer mechanism designed to leverage useful information from multiple tasks to improve generalization performance compared to single-task learning. It has been extensively explored in traditional machine learning to address issues such as data sparsity and overfitting in neural networks. In this work, we apply MTL to problems in science and engineering governed by partial differential equations (PDEs). However, implementing MTL in this context is complex, as it requires task-specific modifications to accommodate various scenarios representing different physical processes. To this end, we present a multi-task deep operator network (MT-DeepONet) to learn solutions across various functional forms of source terms in a PDE and multiple geometries in a single concurrent training session. We introduce modifications in the branch network of the vanilla DeepONet to account for various functional forms of a parameterized coefficient in a PDE. Additionally, we handle parameterized geometries by introducing a binary mask in the branch network and incorporating it into the loss term to improve convergence and generalization to new geometry tasks. Our approach is demonstrated on three benchmark problems: (1) learning different functional forms of the source term in the Fisher equation; (2) learning multiple geometries in a 2D Darcy Flow problem and showcasing better transfer learning capabilities to new geometries; and (3) learning 3D parameterized geometries for a heat transfer problem and demonstrate the ability to predict on new but similar geometries. Our MT-DeepONet framework offers a novel approach to solving PDE problems in engineering and science under a unified umbrella based on synergistic learning that reduces the overall training cost for neural operators.
LGJul 30, 2024
NeuroSEM: A hybrid framework for simulating multiphysics problems by coupling PINNs and spectral elementsKhemraj Shukla, Zongren Zou, Chi Hin Chan et al.
Multiphysics problems that are characterized by complex interactions among fluid dynamics, heat transfer, structural mechanics, and electromagnetics, are inherently challenging due to their coupled nature. While experimental data on certain state variables may be available, integrating these data with numerical solvers remains a significant challenge. Physics-informed neural networks (PINNs) have shown promising results in various engineering disciplines, particularly in handling noisy data and solving inverse problems in partial differential equations (PDEs). However, their effectiveness in forecasting nonlinear phenomena in multiphysics regimes, particularly involving turbulence, is yet to be fully established. This study introduces NeuroSEM, a hybrid framework integrating PINNs with the high-fidelity Spectral Element Method (SEM) solver, Nektar++. NeuroSEM leverages the strengths of both PINNs and SEM, providing robust solutions for multiphysics problems. PINNs are trained to assimilate data and model physical phenomena in specific subdomains, which are then integrated into the Nektar++ solver. We demonstrate the efficiency and accuracy of NeuroSEM for thermal convection in cavity flow and flow past a cylinder. We applied NeuroSEM to the Rayleigh-Bénard convection system, including cases with missing thermal boundary conditions and noisy datasets, and to real particle image velocimetry (PIV) data to capture flow patterns characterized by horseshoe vortical structures. The framework's plug-and-play nature facilitates its extension to other multiphysics or multiscale problems. Furthermore, NeuroSEM is optimized for efficient execution on emerging integrated GPU-CPU architectures. This hybrid approach enhances the accuracy and efficiency of simulations, making it a powerful tool for tackling complex engineering challenges in various scientific domains.
LGApr 15, 2023
Learning in latent spaces improves the predictive accuracy of deep neural operatorsKatiana Kontolati, Somdatta Goswami, George Em Karniadakis et al.
Operator regression provides a powerful means of constructing discretization-invariant emulators for partial-differential equations (PDEs) describing physical systems. Neural operators specifically employ deep neural networks to approximate mappings between infinite-dimensional Banach spaces. As data-driven models, neural operators require the generation of labeled observations, which in cases of complex high-fidelity models result in high-dimensional datasets containing redundant and noisy features, which can hinder gradient-based optimization. Mapping these high-dimensional datasets to a low-dimensional latent space of salient features can make it easier to work with the data and also enhance learning. In this work, we investigate the latent deep operator network (L-DeepONet), an extension of standard DeepONet, which leverages latent representations of high-dimensional PDE input and output functions identified with suitable autoencoders. We illustrate that L-DeepONet outperforms the standard approach in terms of both accuracy and computational efficiency across diverse time-dependent PDEs, e.g., modeling the growth of fracture in brittle materials, convective fluid flows, and large-scale atmospheric flows exhibiting multiscale dynamical features.
COMP-PHSep 2, 2024
Two-stage initial-value iterative physics-informed neural networks for simulating solitary waves of nonlinear wave equationsJin Song, Ming Zhong, George Em Karniadakis et al.
We propose a new two-stage initial-value iterative neural network (IINN) algorithm for solitary wave computations of nonlinear wave equations based on traditional numerical iterative methods and physics-informed neural networks (PINNs). Specifically, the IINN framework consists of two subnetworks, one of which is used to fit a given initial value, and the other incorporates physical information and continues training on the basis of the first subnetwork. Importantly, the IINN method does not require any additional data information including boundary conditions, apart from the given initial value. Corresponding theoretical guarantees are provided to demonstrate the effectiveness of our IINN method. The proposed IINN method is efficiently applied to learn some types of solutions in different nonlinear wave equations, including the one-dimensional (1D) nonlinear Schrödinger equations (NLS) equation (with and without potentials), the 1D saturable NLS equation with PT -symmetric optical lattices, the 1D focusing-defocusing coupled NLS equations, the KdV equation, the two-dimensional (2D) NLS equation with potentials, the 2D amended GP equation with a potential, the (2+1)-dimensional KP equation, and the 3D NLS equation with a potential. These applications serve as evidence for the efficacy of our method. Finally, by comparing with the traditional methods, we demonstrate the advantages of the proposed IINN method.
FLU-DYNJul 22, 2024
Inferring turbulent velocity and temperature fields and their statistics from Lagrangian velocity measurements using physics-informed Kolmogorov-Arnold NetworksJuan Diego Toscano, Theo Käufer, Zhibo Wang et al.
We propose the Artificial Intelligence Velocimetry-Thermometry (AIVT) method to infer hidden temperature fields from experimental turbulent velocity data. This physics-informed machine learning method enables us to infer continuous temperature fields using only sparse velocity data, hence eliminating the need for direct temperature measurements. Specifically, AIVT is based on physics-informed Kolmogorov-Arnold Networks (not neural networks) and is trained by optimizing a combined loss function that minimizes the residuals of the velocity data, boundary conditions, and the governing equations. We apply AIVT to a unique set of experimental volumetric and simultaneous temperature and velocity data of Rayleigh-Bénard convection (RBC) that we acquired by combining Particle Image Thermometry and Lagrangian Particle Tracking. This allows us to compare AIVT predictions and measurements directly. We demonstrate that we can reconstruct and infer continuous and instantaneous velocity and temperature fields from sparse experimental data at a fidelity comparable to direct numerical simulations (DNS) of turbulence. This, in turn, enables us to compute important quantities for quantifying turbulence, such as fluctuations, viscous and thermal dissipation, and QR distribution. This paradigm shift in processing experimental data using AIVT to infer turbulent fields at DNS-level fidelity is a promising avenue in breaking the current deadlock of quantitative understanding of turbulence at high Reynolds numbers, where DNS is computationally infeasible.
FLU-DYNFeb 2, 2023
Deep neural operators can serve as accurate surrogates for shape optimization: A case study for airfoilsKhemraj Shukla, Vivek Oommen, Ahmad Peyvan et al.
Deep neural operators, such as DeepONets, have changed the paradigm in high-dimensional nonlinear regression from function regression to (differential) operator regression, paving the way for significant changes in computational engineering applications. Here, we investigate the use of DeepONets to infer flow fields around unseen airfoils with the aim of shape optimization, an important design problem in aerodynamics that typically taxes computational resources heavily. We present results which display little to no degradation in prediction accuracy, while reducing the online optimization cost by orders of magnitude. We consider NACA airfoils as a test case for our proposed approach, as their shape can be easily defined by the four-digit parametrization. We successfully optimize the constrained NACA four-digit problem with respect to maximizing the lift-to-drag ratio and validate all results by comparing them to a high-order CFD solver. We find that DeepONets have low generalization error, making them ideal for generating solutions of unseen shapes. Specifically, pressure, density, and velocity fields are accurately inferred at a fraction of a second, hence enabling the use of general objective functions beyond the maximization of the lift-to-drag ratio considered in the current work.
LGJul 18, 2023
Real-time Inference and Extrapolation via a Diffusion-inspired Temporal Transformer Operator (DiTTO)Oded Ovadia, Vivek Oommen, Adar Kahana et al.
Extrapolation remains a grand challenge in deep neural networks across all application domains. We propose an operator learning method to solve time-dependent partial differential equations (PDEs) continuously and with extrapolation in time without any temporal discretization. The proposed method, named Diffusion-inspired Temporal Transformer Operator (DiTTO), is inspired by latent diffusion models and their conditioning mechanism, which we use to incorporate the temporal evolution of the PDE, in combination with elements from the transformer architecture to improve its capabilities. Upon training, DiTTO can make inferences in real-time. We demonstrate its extrapolation capability on a climate problem by estimating the temperature around the globe for several years, and also in modeling hypersonic flows around a double-cone. We propose different training strategies involving temporal-bundling and sub-sampling and demonstrate performance improvements for several benchmarks, performing extrapolation for long time intervals as well as zero-shot super-resolution in time.
COMP-PHJan 26, 2023
A Hybrid Deep Neural Operator/Finite Element Method for Ice-Sheet ModelingQiZhi He, Mauro Perego, Amanda A. Howard et al.
One of the most challenging and consequential problems in climate modeling is to provide probabilistic projections of sea level rise. A large part of the uncertainty of sea level projections is due to uncertainty in ice sheet dynamics. At the moment, accurate quantification of the uncertainty is hindered by the cost of ice sheet computational models. In this work, we develop a hybrid approach to approximate existing ice sheet computational models at a fraction of their cost. Our approach consists of replacing the finite element model for the momentum equations for the ice velocity, the most expensive part of an ice sheet model, with a Deep Operator Network, while retaining a classic finite element discretization for the evolution of the ice thickness. We show that the resulting hybrid model is very accurate and it is an order of magnitude faster than the traditional finite element model. Further, a distinctive feature of the proposed model compared to other neural network approaches, is that it can handle high-dimensional parameter spaces (parameter fields) such as the basal friction at the bed of the glacier, and can therefore be used for generating samples for uncertainty quantification. We study the impact of hyper-parameters, number of unknowns and correlation length of the parameter distribution on the training and accuracy of the Deep Operator Network on a synthetic ice sheet model. We then target the evolution of the Humboldt glacier in Greenland and show that our hybrid model can provide accurate statistics of the glacier mass loss and can be effectively used to accelerate the quantification of uncertainty.
LGNov 26, 2023
Bias-Variance Trade-off in Physics-Informed Neural Networks with Randomized Smoothing for High-Dimensional PDEsZheyuan Hu, Zhouhao Yang, Yezhen Wang et al.
While physics-informed neural networks (PINNs) have been proven effective for low-dimensional partial differential equations (PDEs), the computational cost remains a hurdle in high-dimensional scenarios. This is particularly pronounced when computing high-order and high-dimensional derivatives in the physics-informed loss. Randomized Smoothing PINN (RS-PINN) introduces Gaussian noise for stochastic smoothing of the original neural net model, enabling Monte Carlo methods for derivative approximation, eliminating the need for costly auto-differentiation. Despite its computational efficiency in high dimensions, RS-PINN introduces biases in both loss and gradients, negatively impacting convergence, especially when coupled with stochastic gradient descent (SGD). We present a comprehensive analysis of biases in RS-PINN, attributing them to the nonlinearity of the Mean Squared Error (MSE) loss and the PDE nonlinearity. We propose tailored bias correction techniques based on the order of PDE nonlinearity. The unbiased RS-PINN allows for a detailed examination of its pros and cons compared to the biased version. Specifically, the biased version has a lower variance and runs faster than the unbiased version, but it is less accurate due to the bias. To optimize the bias-variance trade-off, we combine the two approaches in a hybrid method that balances the rapid convergence of the biased version with the high accuracy of the unbiased version. In addition, we present an enhanced implementation of RS-PINN. Extensive experiments on diverse high-dimensional PDEs, including Fokker-Planck, HJB, viscous Burgers', Allen-Cahn, and Sine-Gordon equations, illustrate the bias-variance trade-off and highlight the effectiveness of the hybrid RS-PINN. Empirical guidelines are provided for selecting biased, unbiased, or hybrid versions, depending on the dimensionality and nonlinearity of the specific PDE problem.
LGJun 27, 2023
MyCrunchGPT: A chatGPT assisted framework for scientific machine learningVarun Kumar, Leonard Gleyzer, Adar Kahana et al.
Scientific Machine Learning (SciML) has advanced recently across many different areas in computational science and engineering. The objective is to integrate data and physics seamlessly without the need of employing elaborate and computationally taxing data assimilation schemes. However, preprocessing, problem formulation, code generation, postprocessing and analysis are still time consuming and may prevent SciML from wide applicability in industrial applications and in digital twin frameworks. Here, we integrate the various stages of SciML under the umbrella of ChatGPT, to formulate MyCrunchGPT, which plays the role of a conductor orchestrating the entire workflow of SciML based on simple prompts by the user. Specifically, we present two examples that demonstrate the potential use of MyCrunchGPT in optimizing airfoils in aerodynamics, and in obtaining flow fields in various geometries in interactive mode, with emphasis on the validation stage. To demonstrate the flow of the MyCrunchGPT, and create an infrastructure that can facilitate a broader vision, we built a webapp based guided user interface, that includes options for a comprehensive summary report. The overall objective is to extend MyCrunchGPT to handle diverse problems in computational mechanics, design, optimization and controls, and general scientific computing tasks involved in SciML, hence using it as a research assistant tool but also as an educational tool. While here the examples focus in fluid mechanics, future versions will target solid mechanics and materials science, geophysics, systems biology and bioinformatics.
LGAug 29, 2024
Physics-Informed Neural Networks and ExtensionsMaziar Raissi, Paris Perdikaris, Nazanin Ahmadi et al.
In this paper, we review the new method Physics-Informed Neural Networks (PINNs) that has become the main pillar in scientific machine learning, we present recent practical extensions, and provide a specific example in data-driven discovery of governing differential equations.
LGOct 4, 2023
Learning characteristic parameters and dynamics of centrifugal pumps under multiphase flow using physics-informed neural networksFelipe de Castro Teixeira Carvalho, Kamaljyoti Nath, Alberto Luiz Serpa et al.
Electrical submersible pumps (ESPs) are prevalently utilized as artificial lift systems in the oil and gas industry. These pumps frequently encounter multiphase flows comprising a complex mixture of hydrocarbons, water, and sediments. Such mixtures lead to the formation of emulsions, characterized by an effective viscosity distinct from that of the individual phases. Traditional multiphase flow meters, employed to assess these conditions, are burdened by high operational costs and susceptibility to degradation. To this end, this study introduces a physics-informed neural network (PINN) model designed to indirectly estimate the fluid properties, dynamic states, and crucial parameters of an ESP system. A comprehensive structural and practical identifiability analysis was performed to delineate the subset of parameters that can be reliably estimated through the use of intake and discharge pressure measurements from the pump. The efficacy of the PINN model was validated by estimating the unknown states and parameters using these pressure measurements as input data. Furthermore, the performance of the PINN model was benchmarked against the particle filter method utilizing both simulated and experimental data across varying water content scenarios. The comparative analysis suggests that the PINN model holds significant potential as a viable alternative to conventional multiphase flow meters, offering a promising avenue for enhancing operational efficiency and reducing costs in ESP applications.
LGOct 30, 2023
Operator Learning Enhanced Physics-informed Neural Networks for Solving Partial Differential Equations Characterized by Sharp SolutionsBin Lin, Zhiping Mao, Zhicheng Wang et al.
Physics-informed Neural Networks (PINNs) have been shown as a promising approach for solving both forward and inverse problems of partial differential equations (PDEs). Meanwhile, the neural operator approach, including methods such as Deep Operator Network (DeepONet) and Fourier neural operator (FNO), has been introduced and extensively employed in approximating solution of PDEs. Nevertheless, to solve problems consisting of sharp solutions poses a significant challenge when employing these two approaches. To address this issue, we propose in this work a novel framework termed Operator Learning Enhanced Physics-informed Neural Networks (OL-PINN). Initially, we utilize DeepONet to learn the solution operator for a set of smooth problems relevant to the PDEs characterized by sharp solutions. Subsequently, we integrate the pre-trained DeepONet with PINN to resolve the target sharp solution problem. We showcase the efficacy of OL-PINN by successfully addressing various problems, such as the nonlinear diffusion-reaction equation, the Burgers equation and the incompressible Navier-Stokes equation at high Reynolds number. Compared with the vanilla PINN, the proposed method requires only a small number of residual points to achieve a strong generalization capability. Moreover, it substantially enhances accuracy, while also ensuring a robust training process. Furthermore, OL-PINN inherits the advantage of PINN for solving inverse problems. To this end, we apply the OL-PINN approach for solving problems with only partial boundary conditions, which usually cannot be solved by the classical numerical methods, showing its capacity in solving ill-posed problems and consequently more complex inverse problems.
AO-PHFeb 7, 2023
Learning bias corrections for climate models using deep neural operatorsAniruddha Bora, Khemraj Shukla, Shixuan Zhang et al.
Numerical simulation for climate modeling resolving all important scales is a computationally taxing process. Therefore, to circumvent this issue a low resolution simulation is performed, which is subsequently corrected for bias using reanalyzed data (ERA5), known as nudging correction. The existing implementation for nudging correction uses a relaxation based method for the algebraic difference between low resolution and ERA5 data. In this study, we replace the bias correction process with a surrogate model based on the Deep Operator Network (DeepONet). DeepONet (Deep Operator Neural Network) learns the mapping from the state before nudging (a functional) to the nudging tendency (another functional). The nudging tendency is a very high dimensional data albeit having many low energy modes. Therefore, the DeepoNet is combined with a convolution based auto-encoder-decoder (AED) architecture in order to learn the nudging tendency in a lower dimensional latent space efficiently. The accuracy of the DeepONet model is tested against the nudging tendency obtained from the E3SMv2 (Energy Exascale Earth System Model) and shows good agreement. The overarching goal of this work is to deploy the DeepONet model in an online setting and replace the nudging module in the E3SM loop for better efficiency and accuracy.
NANov 17, 2022
SMS: Spiking Marching Scheme for Efficient Long Time Integration of Differential EquationsQian Zhang, Adar Kahana, George Em Karniadakis et al.
We propose a Spiking Neural Network (SNN)-based explicit numerical scheme for long time integration of time-dependent Ordinary and Partial Differential Equations (ODEs, PDEs). The core element of the method is a SNN, trained to use spike-encoded information about the solution at previous timesteps to predict spike-encoded information at the next timestep. After the network has been trained, it operates as an explicit numerical scheme that can be used to compute the solution at future timesteps, given a spike-encoded initial condition. A decoder is used to transform the evolved spiking-encoded solution back to function values. We present results from numerical experiments of using the proposed method for ODEs and PDEs of varying complexity.