NA May 9, 2017
A Priori Estimation Of Memory Effects In Coarse-Grained Nonlinear Systems Using The Mori-Zwanzig Formalism
Ayoub Gouasmi, Eric Parish, Karthik Duraisamy
Reduced Order Models (ROMs) of complex, nonlinear dynamical systems often require closure, which is the process of representing the contribution of the unresolved physics on the resolved physics. The Mori-Zwanzig (M-Z) procedure allows one to write down formally closed evolution equations for the resolved physics. In these equations, the unclosed terms are recast as a memory integral involving the past history of the resolved variables, and a "noise" term. While the M-Z procedure does not directly reduce the complexity of the original system, these equations can serve as a mathematically consistent starting point to develop closures based on approximations of the memory. In this scenario, a priori knowledge of the memory kernel, which is not explicitly known for nonlinear systems, is of paramount importance to assess the validity of a memory approximation. Unraveling the memory kernel requires the determination of the orthogonal dynamics which is a projected high-dimensional partial differential equation that is not tractable in general. A method to estimate the memory kernel a priori, using full-order solution snapshots, is proposed. The main idea is to solve a pseudo orthogonal dynamics equation, that has a convenient Liouville form, instead of the original one. This ersatz arises from the assumption that the semi-group of the orthogonal dynamics operator is a composition operator, akin to semi-groups of Liouville operators, for one observable. The method is exact in the linear case where the kernel is known explicitly. Results for under-resolved simulations of the Burgers and Kuramoto-Sivashinsky equations demonstrate that the proposed technique can accurately reconstruct the transfer of information between the resolved and unresolved dynamics through memory, and provide valuable information about the kernel.
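For the linear case mentioned in the abstract, the memory kernel can be written in closed form. As a sketch, assume a linear system partitioned into resolved variables x and unresolved variables y; the block names A_{11}, A_{12}, A_{21}, A_{22} are notation introduced here for illustration and are not taken from the paper. Solving the y equation and substituting it into the x equation gives a Markovian term, a memory integral over the history of x, and a term depending on the unresolved initial condition:

```latex
\begin{aligned}
\dot{x} &= A_{11} x + A_{12} y, \qquad \dot{y} = A_{21} x + A_{22} y, \\
\dot{x}(t) &= \underbrace{A_{11}\, x(t)}_{\text{Markovian}}
 + \underbrace{\int_0^t A_{12}\, e^{A_{22}(t-s)} A_{21}\, x(s)\, ds}_{\text{memory}}
 + \underbrace{A_{12}\, e^{A_{22} t}\, y(0)}_{\text{noise}} .
\end{aligned}
```

Under these assumptions the kernel is K(t) = A_{12} e^{A_{22} t} A_{21}, which is the kind of explicit expression that is not available for nonlinear systems.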
FLU-DYN Feb 5, 2019
Reduced Order Modeling Framework for Combustor Instabilities Using Truncated Domain Training
Jiayang Xu, Cheng Huang, Karthik Duraisamy
A multi-fidelity framework is established and demonstrated for prediction of combustion instabilities in rocket engines. The major idea is to adapt appropriate fidelity modeling approaches for different components in a rocket engine to ensure accurate and efficient predictions. Specifically, the proposed framework integrates projection-based Reduced Order Models (ROMs) that are developed using bases generated on truncated domain simulations. The ROM training is performed on truncated domains and therefore does not require full order model solutions on the full rocket geometry, which demonstrates the potential to greatly reduce training cost. Geometry-specific training is replaced by the response generated by perturbing the characteristics at the boundary of the truncated domain. This training method is shown to enhance predictive capabilities and robustness of the resulting ROMs, including at conditions outside the training range. Numerical tests are conducted on a quasi-1D model of a single-element rocket combustor and the present framework is compared to traditional ROM development approaches.
NA Aug 3, 2018
Entropy Conservative Schemes and the Receding Flow Problem
Ayoub Gouasmi, Scott Murman, Karthik Duraisamy
This work delves into the family of entropy conservative (EC) schemes introduced by Tadmor. The discussion is centered around the Euler equations of fluid mechanics and the receding flow problem extensively studied by Liou. This work is motivated by Liou's recent findings that an abnormal spike in temperature observed with finite-volume schemes is linked to a spurious entropy rise, and that it can be prevented in principle by conserving entropy. While a semi-discrete analysis suggests EC schemes are a good fit, a fully discrete analysis based on Tadmor's framework shows the non-negligible impact of time-integration on the solution behavior. An EC time-integration scheme is developed to show that enforcing conservation of entropy at the fully discrete level does not necessarily guarantee a well-behaved solution.
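As a minimal illustration of the semi-discrete entropy-conservation condition behind Tadmor-type EC schemes, the sketch below uses the scalar Burgers equation instead of the Euler equations treated in the paper. That substitution, the entropy choice U = u^2/2 (so the entropy variable is v = u and the potential is psi = u^3/6), and all function names are assumptions made for illustration only.

```python
import math


def ec_flux_burgers(u_left, u_right):
    """Two-point entropy conservative flux for Burgers with entropy U = u^2 / 2."""
    return (u_left**2 + u_left * u_right + u_right**2) / 6.0


def entropy_potential(u):
    """Entropy flux potential psi = v*f - F for Burgers with U = u^2 / 2."""
    return u**3 / 6.0


def entropy_residual(u_left, u_right):
    """Residual of the EC condition (v_R - v_L) * f_star = psi_R - psi_L."""
    lhs = (u_right - u_left) * ec_flux_burgers(u_left, u_right)
    rhs = entropy_potential(u_right) - entropy_potential(u_left)
    return lhs - rhs


def total_entropy_rate(u, dx):
    """Semi-discrete d/dt of sum(U_i * dx) on a periodic grid using the EC flux."""
    n = len(u)
    # flux[i] is the interface flux between cell i and cell i + 1 (periodic wrap).
    flux = [ec_flux_burgers(u[i], u[(i + 1) % n]) for i in range(n)]
    # flux[i - 1] with i = 0 wraps to the last interface, as required by periodicity.
    dudt = [-(flux[i] - flux[i - 1]) / dx for i in range(n)]
    # dU/dt = v * du/dt with v = u for the quadratic entropy.
    return sum(u[i] * dudt[i] * dx for i in range(n))


def sine_profile(n):
    """Hypothetical smooth periodic initial condition used only for the demo."""
    return [0.5 + math.sin(2.0 * math.pi * i / n) for i in range(n)]
```

The total entropy rate vanishes to round-off for any periodic state, but only in this semi-discrete sense; a time integrator generally alters the discrete entropy balance, which is the issue the abstract raises.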
NA Dec 27, 2017
A Unified Framework for Multiscale Modeling using the Mori-Zwanzig Formalism and the Variational Multiscale Method
Eric J. Parish, Karthik Duraisamy
We describe a paradigm for multiscale modeling that combines the Mori-Zwanzig (MZ) formalism of Statistical Mechanics with the Variational Multiscale (VMS) method. The MZ-VMS approach leverages both VMS scale-separation projectors as well as phase-space projectors to provide a systematic modeling approach that is applicable to non-linear partial differential equations. Spectral as well as continuous and discontinuous finite element methods are considered. The framework leads to a formally closed equation in which the effect of the unresolved scales on the resolved scales is non-local in time and appears as a convolution or memory integral. The resulting non-Markovian system is used as a starting point for model development. We discover that unresolved scales lead to memory effects that are driven by an orthogonal projection of the coarse-scale residual and inter-element jumps. It is further shown that an MZ-based finite memory model is a variant of the well-known adjoint-stabilization method. For hyperbolic equations, this stabilization is shown to have the form of an artificial viscosity term. We further establish connections between the memory kernel and approximate Riemann solvers. It is demonstrated that, in the case of one-dimensional linear advection, the assumption of a finite memory and a linear quadrature leads to a closure term that is formally equivalent to an upwind flux correction.
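To make the phrase "upwind flux correction" concrete, the sketch below checks the standard identity that, for linear advection with speed a, the upwind interface flux equals the central flux plus a dissipative correction proportional to |a| and the interface jump. It does not reproduce the MZ-VMS derivation; the function names are hypothetical and chosen only for this illustration.

```python
def central_flux(a, u_left, u_right):
    """Central (non-dissipative) interface flux for u_t + a u_x = 0."""
    return 0.5 * a * (u_left + u_right)


def upwind_flux(a, u_left, u_right):
    """Upwind interface flux: take the state on the side the wave comes from."""
    return a * u_left if a >= 0 else a * u_right


def central_plus_correction(a, u_left, u_right):
    """Central flux plus an artificial-viscosity-like jump penalty."""
    return central_flux(a, u_left, u_right) - 0.5 * abs(a) * (u_right - u_left)
```

For example, with a = 2.0, u_left = 0.3 and u_right = -0.7, both `upwind_flux` and `central_plus_correction` evaluate to 0.6 up to round-off.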
NA Nov 21, 2018
On entropy stable temporal fluxes
Ayoub Gouasmi, Karthik Duraisamy, Scott Murman
Entropy-stable (ES) schemes have gained considerable attention over the last decade, especially in the context of turbulent flow simulations using high-order methods. While promising because of their nonlinear stability properties, ES schemes have to address a number of issues to become practical. One of them is how much entropy should be produced by the scheme at a certain level of under-resolution. This problem has so far been studied by considering different ES interface fluxes in the spatial discretization only, because they can be tuned to generate a certain amount of entropy. In this note, we point out that, in the context of space-time discretizations, the same applies to ES interface fluxes in the temporal direction.
23.7 LG May 27
History-aware adaptive reduced-order models via incremental singular value decomposition
Amirpasha Hedayat, Ali Mohaghegh, Laura Balzano et al.
Reduced-order models (ROMs) can accelerate high-dimensional dynamical simulations, but their accuracy often deteriorates when online dynamics leave the regime represented by offline training data. We develop a projection-based adaptive ROM framework based on incremental singular value decomposition (iSVD), in which occasional full-order operator evaluations provide correction snapshots for online basis updates. The intrusive ROMs considered here are fully parameterized by the basis, so each update naturally propagates to reduced operators and hyper-reduction machinery. Through its evolving singular structure, iSVD retains an encoded history of the observed dynamics and is history-aware in this sense. We study the method on three nonlinear problems of increasing complexity: the one-dimensional viscous Burgers equation, the Sod shock tube, and a stiff one-dimensional ten-species rotating detonation engine (RDE). The Burgers problem is used to analyze the method and compare iSVD with alternative basis adaptation rules, showing that history-aware updates outperform instantaneous updates and that iSVD gives the strongest overall performance. The Sod and RDE cases demonstrate that these advantages persist in more challenging compressible-flow settings. For the RDE problem, the iSVD adaptive ROM improves upon the current state-of-the-art Direct adaptive ROM baseline in both predictive accuracy and computational efficiency. A cost analysis shows that the dominant online cost comes from interacting with the full-order model to obtain correction snapshots, while the iSVD update itself is negligible. These results identify iSVD as an effective mechanism for online learning of reduced subspaces and suggest a path toward ROMs that remain predictive over horizons several orders of magnitude longer than their initial training window.
DS Apr 24, 2023
On the lifting and reconstruction of nonlinear systems with multiple invariant sets
Shaowu Pan, Karthik Duraisamy
The Koopman operator provides a linear perspective on non-linear dynamics by focusing on the evolution of observables in an invariant subspace. Observables of interest are typically linearly reconstructed from the Koopman eigenfunctions. Despite the broad use of Koopman operators over the past few years, there exist some misconceptions about the applicability of Koopman operators to dynamical systems with more than one disjoint invariant set (e.g., basins of attraction from isolated fixed points). In this work, we first provide a simple explanation for the mechanism of linear reconstruction-based Koopman operators of nonlinear systems with multiple disjoint invariant sets. Next, we discuss the use of discrete symmetry among such invariant sets to construct Koopman eigenfunctions in a data-efficient manner. Finally, several numerical examples are provided to illustrate the benefits of exploiting symmetry for learning the Koopman operator.
LG Aug 30, 2024
Spatially-Aware Diffusion Models with Cross-Attention for Global Field Reconstruction with Sparse Observations
Yilin Zhuang, Sibo Cheng, Karthik Duraisamy
Diffusion models have gained attention for their ability to represent complex distributions and incorporate uncertainty, making them ideal for robust predictions in the presence of noisy or incomplete data. In this study, we develop and enhance score-based diffusion models in field reconstruction tasks, where the goal is to estimate complete spatial fields from partial observations. We introduce a condition encoding approach to construct a tractable mapping between observed and unobserved regions using a learnable integration of sparse observations and interpolated fields as an inductive bias. With refined sensing representations and an unraveled temporal dimension, our method can handle arbitrary moving sensors and effectively reconstruct fields. Furthermore, we conduct a comprehensive benchmark of our approach against a deterministic interpolation-based method across various static and time-dependent PDEs. Our study attempts to address the gap in strong baselines for evaluating performance across varying sampling hyperparameters, noise levels, and conditioning methods. Our results show that diffusion models with cross-attention and the proposed conditional encoding generally outperform other methods under noisy conditions, although the deterministic method excels with noiseless data. Additionally, both the diffusion models and the deterministic method surpass the numerical approach in accuracy and computational cost for the steady problem. We also demonstrate the ability of the model to capture possible reconstructions and improve the accuracy of fused results in covariance-based correction tasks using ensemble sampling.
LG Aug 24, 2023
Easy attention: A simple attention mechanism for temporal predictions with transformers
Marcial Sanchis-Agudo, Yuning Wang, Roger Arnau et al.
To improve the robustness of transformer neural networks used for temporal-dynamics prediction of chaotic systems, we propose a novel attention mechanism called easy attention which we demonstrate in time-series reconstruction and prediction. While the standard self attention only makes use of the inner product of queries and keys, it is demonstrated that the keys, queries and softmax are not necessary for obtaining the attention score required to capture long-term dependencies in temporal sequences. Through the singular-value decomposition (SVD) on the softmax attention score, we further observe that self attention compresses the contributions from both queries and keys in the space spanned by the attention score. Therefore, our proposed easy-attention method directly treats the attention scores as learnable parameters. This approach produces excellent results when reconstructing and predicting the temporal dynamics of chaotic systems exhibiting more robustness and less complexity than self attention or the widely-used long short-term memory (LSTM) network. We show the improved performance of the easy-attention method in the Lorenz system, a turbulence shear flow and a model of a nuclear reactor.
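The contrast drawn in the abstract can be sketched in a few lines: standard self-attention computes its score matrix from queries and keys through a softmax, whereas an easy-attention-style layer stores the score matrix directly as a parameter. The pure-Python sketch below is a single-head forward pass only, with no training loop; the names `alpha`, `w_q`, `w_k`, `w_v` and the helper functions are assumptions introduced here, not the authors' implementation.

```python
import math


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def softmax_rows(scores):
    """Row-wise softmax with max subtraction for numerical stability."""
    out = []
    for row in scores:
        m = max(row)
        e = [math.exp(v - m) for v in row]
        z = sum(e)
        out.append([v / z for v in e])
    return out


def self_attention(x, w_q, w_k, w_v):
    """Standard scaled dot-product self-attention (single head)."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d = len(q[0])
    scores = [[s / math.sqrt(d) for s in row] for row in matmul(q, transpose(k))]
    return matmul(softmax_rows(scores), v)


def easy_attention(x, alpha, w_v):
    """Easy-attention-style layer: alpha (T x T) is itself the learnable score matrix."""
    return matmul(alpha, matmul(x, w_v))
```

In this sketch `alpha` has one row and column per time step, so the queries, keys and softmax drop out of the forward pass entirely; how `alpha` is initialized and trained is left open.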
LG Jun 10, 2025
LaDCast: A Latent Diffusion Model for Medium-Range Ensemble Weather Forecasting
Yilin Zhuang, Karthik Duraisamy
Accurate probabilistic weather forecasting demands both high accuracy and efficient uncertainty quantification, challenges that overburden both ensemble numerical weather prediction (NWP) and recent machine-learning methods. We introduce LaDCast, the first global latent-diffusion framework for medium-range ensemble forecasting, which generates hourly ensemble forecasts entirely in a learned latent space. An autoencoder compresses high-dimensional ERA5 reanalysis fields into a compact representation, and a transformer-based diffusion model produces sequential latent updates with arbitrary hour initialization. The model incorporates Geometric Rotary Position Embedding (GeoRoPE) to account for the Earth's spherical geometry, a dual-stream attention mechanism for efficient conditioning, and sinusoidal temporal embeddings to capture seasonal patterns. LaDCast achieves deterministic and probabilistic skill close to that of the European Centre for Medium-Range Weather Forecasts IFS-ENS, without any explicit perturbations. Notably, LaDCast demonstrates superior performance in tracking rare extreme events such as cyclones, capturing their trajectories more accurately than established models. By operating in latent space, LaDCast reduces storage and compute by orders of magnitude, demonstrating a practical path toward forecasting at kilometer-scale resolution in real time. We open-source our code and models and provide the training and evaluation pipelines at: https://github.com/tonyzyl/ladcast.
95.9 NA May 13
Fully Discrete Active Flux Method based on Transported Acoustic Increments for the Compressible Euler Equations
Karthik Duraisamy
A fully discrete Active Flux method is proposed for the 2D compressible Euler equations. The method builds on the evolution-operator formulation proposed by Roe in which conservative cell averages are updated by unsplit flux quadrature while primitive point values are evolved by acoustic and advective subsolvers. The proposed method reconstructs the acoustic increment as a cellwise Q2 field and evaluates this field at the convective foot of the target point. For constant frozen coefficients, the resulting point update reduces to the transported composition, eliminating the additive split defect and yielding the exact unsplit frozen evolution when the acoustic and advective generators commute. The resulting method preserves the exact locally linearized acoustic evolution operator of Barsukow (2025), the compact stencil, and the conservative one-stage average update. Numerical experiments probe several facets of the numerical method. A mixed Fourier wave packet isolates the split error and shows third-order point accuracy for the transported update, compared with second-order behavior for the additive update. Isentropic vortex convection confirms third-order convergence for the full nonlinear scheme, reduced error constants, and an enlarged empirical CFL range. Nonlinear Gaussian acoustic pulse evolution demonstrates preservation of radial symmetry and near-third-order decay of the symmetry error. Low-Mach shear layer tests show coherent vorticity evolution, ultra-low entropy dissipation, and absence of the coarse-grid secondary vortices seen in displayed DG/CG comparisons. Finally, a compressible under-resolved Kelvin-Helmholtz test demonstrates robust no-limiter evolution to late time with consistent entropy dissipation. Fourier diagnostics of the vertical-edge point operator support the observed improvements in acoustic phase and amplification behavior.
LG Dec 16, 2023
CoCoGen: Physically-Consistent and Conditioned Score-based Generative Models for Forward and Inverse Problems
Christian Jacobsen, Yilin Zhuang, Karthik Duraisamy
Recent advances in generative artificial intelligence have had a significant impact on diverse domains spanning computer vision, natural language processing, and drug discovery. This work extends the reach of generative models into physical problem domains, particularly addressing the efficient enforcement of physical laws and conditioning for forward and inverse problems involving partial differential equations (PDEs). Our work introduces two key contributions: firstly, we present an efficient approach to promote consistency with the underlying PDE. By incorporating discretized information into score-based generative models, our method generates samples closely aligned with the true data distribution, showcasing residuals comparable to data generated through conventional PDE solvers, significantly enhancing fidelity. Secondly, we showcase the potential and versatility of score-based generative models in various physics tasks, specifically highlighting surrogate modeling as well as probabilistic field reconstruction and inversion from sparse measurements. A robust foundation is laid by designing unconditional score-based generative models that utilize reversible probability flow ordinary differential equations. Leveraging conditional models that require minimal training, we illustrate their flexibility when combined with a frozen unconditional model. These conditional models generate PDE solutions by incorporating parameters, macroscopic quantities, or partial field measurements as guidance. The results illustrate the inherent flexibility of score-based generative models and explore the synergy between unconditional score-based generative models and the present physically-consistent sampling approach, emphasizing the power and flexibility in solving for and inverting physical fields governed by differential equations, and in other scientific machine learning tasks.
LG Apr 8, 2024
Variational Bayesian Optimal Experimental Design with Normalizing Flows
Jiayuan Dong, Christian Jacobsen, Mehdi Khalloufi et al.
Bayesian optimal experimental design (OED) seeks experiments that maximize the expected information gain (EIG) in model parameters. Directly estimating the EIG using nested Monte Carlo is computationally expensive and requires an explicit likelihood. Variational OED (vOED), in contrast, estimates a lower bound of the EIG without likelihood evaluations by approximating the posterior distributions with variational forms, and then tightens the bound by optimizing its variational parameters. We introduce the use of normalizing flows (NFs) for representing variational distributions in vOED; we call this approach vOED-NFs. Specifically, we adopt NFs with a conditional invertible neural network architecture built from compositions of coupling layers, and enhanced with a summary network for data dimension reduction. We present Monte Carlo estimators to the lower bound along with gradient expressions to enable a gradient-based simultaneous optimization of the variational parameters and the design variables. The vOED-NFs algorithm is then validated in two benchmark problems, and demonstrated on a partial differential equation-governed application of cathodic electrophoretic deposition and an implicit likelihood case with stochastic modeling of aphid population. The findings suggest that a composition of 4--5 coupling layers is able to achieve lower EIG estimation bias, under a fixed budget of forward model runs, compared to previous approaches. The resulting NFs produce approximate posteriors that agree well with the true posteriors, able to capture non-Gaussian and multi-modal features effectively.
AI Jun 26, 2025
Active Inference AI Systems for Scientific Discovery
Karthik Duraisamy
The rapid evolution of artificial intelligence has led to expectations of transformative impact on science, yet current systems remain fundamentally limited in enabling genuine scientific discovery. This perspective contends that progress turns on closing three mutually reinforcing gaps in abstraction, reasoning and empirical grounding. Central to addressing these gaps is recognizing complementary cognitive modes: thinking as slow, iterative hypothesis generation -- exploring counterfactual spaces where physical laws can be temporarily violated to discover new patterns -- and reasoning as fast, deterministic validation, traversing established knowledge graphs to test consistency with known principles. Abstractions in this loop should be manipulable models that enable counterfactual prediction, causal attribution, and refinement. Design principles -- rather than a monolithic recipe -- are proposed for systems that reason in imaginary spaces and learn from the world: causal, multimodal models for internal simulation; persistent, uncertainty-aware scientific memory that distinguishes hypotheses from established claims; formal verification pathways coupled to computations and experiments. It is also argued that the inherent ambiguity in feedback from simulations and experiments, and underlying uncertainties make human judgment indispensable, not as a temporary scaffold but as a permanent architectural component. Evaluations must assess the system's ability to identify novel phenomena, propose falsifiable hypotheses, and efficiently guide experimental programs toward genuine discoveries.
CO Dec 14, 2025
Flow matching Operators for Residual-Augmented Probabilistic Learning of Partial Differential Equations
Sahil Bhola, Karthik Duraisamy
Learning probabilistic surrogates for partial differential equations remains challenging in data-scarce regimes: neural operators require large amounts of high-fidelity data, while generative approaches typically sacrifice resolution invariance. We formulate flow matching in an infinite-dimensional function space to learn a probabilistic transport that maps low-fidelity approximations to the manifold of high-fidelity PDE solutions via learned residual corrections. We develop a conditional neural operator architecture based on feature-wise linear modulation for flow matching vector fields directly in function space, enabling inference at arbitrary spatial resolutions without retraining. To improve stability and representational control of the induced neural ODE, we parameterize the flow vector field as a sum of a linear operator and a nonlinear operator, combining lightweight linear components with a conditioned Fourier neural operator for expressive, input-dependent dynamics. We then formulate a residual-augmented learning strategy where the flow model learns probabilistic corrections from inexpensive low-fidelity surrogates to high-fidelity solutions, rather than learning the full solution mapping from scratch. Finally, we derive tractable training objectives that extend conditional flow matching to the operator setting with input-function-dependent couplings. To demonstrate the effectiveness of our approach, we present numerical experiments on a range of PDEs, including the 1D advection and Burgers' equation, and a 2D Darcy flow problem for flow through a porous medium. We show that the proposed method can accurately learn solution operators across different resolutions and fidelities and produces uncertainty estimates that appropriately reflect model confidence, even when trained on limited high-fidelity data.
LG Feb 11
Toward Adaptive Non-Intrusive Reduced-Order Models: Design and Challenges
Amirpasha Hedayat, Alberto Padovan, Karthik Duraisamy
Projection-based Reduced Order Models (ROMs) are often deployed as static surrogates, which limits their practical utility once a system leaves the training manifold. We formalize and study adaptive non-intrusive ROMs that update both the latent subspace and the reduced dynamics online. Building on ideas from static non-intrusive ROMs, specifically, Operator Inference (OpInf) and the recently-introduced Non-intrusive Trajectory-based optimization of Reduced-Order Models (NiTROM), we propose three formulations: Adaptive OpInf (sequential basis/operator refits), Adaptive NiTROM (joint Riemannian optimization of encoder/decoder and polynomial dynamics), and a hybrid that initializes NiTROM with an OpInf update. We describe the online data window, adaptation window, and computational budget, and analyze cost scaling. On a transiently perturbed lid-driven cavity flow, static Galerkin/OpInf/NiTROM drift or destabilize when forecasting beyond training. In contrast, Adaptive OpInf robustly suppresses amplitude drift with modest cost; Adaptive NiTROM is shown to attain near-exact energy tracking under frequent updates but is sensitive to its initialization and optimization depth; the hybrid is most reliable under regime changes and minimal offline data, yielding physically coherent fields and bounded energy. We argue that predictive claims for ROMs must be cost-aware and transparent, with clear separation of training/adaptation/deployment regimes and explicit reporting of online budgets and full-order model queries. This work provides a practical template for building self-correcting, non-intrusive ROMs that remain effective as the dynamics evolve well beyond the initial manifold.
LG Nov 16, 2025
Attention-Enhanced Convolutional Autoencoder and Structured Delay Embeddings for Weather Prediction
Amirpasha Hedayat, Karthik Duraisamy
Weather prediction is a quintessential problem involving the forecasting of a complex, nonlinear, and chaotic high-dimensional dynamical system. This work introduces an efficient reduced-order modeling (ROM) framework for short-range weather prediction and investigates fundamental questions in dimensionality reduction and reduced order modeling of such systems. Unlike recent AI-driven models, which require extensive computational resources, our framework prioritizes efficiency while achieving reasonable accuracy. Specifically, a ResNet-based convolutional autoencoder augmented by block attention modules is developed to reduce the dimensionality of high-dimensional weather data. Subsequently, a linear operator is learned in the time-delayed embedding of the latent space to efficiently capture the dynamics. Using the ERA5 reanalysis dataset, we demonstrate that this framework performs well in-distribution as evidenced by effectively predicting weather patterns within training data periods. We also identify important limitations in generalizing to future states, particularly in maintaining prediction accuracy beyond the training window. Our analysis reveals that weather systems exhibit strong temporal correlations that can be effectively captured through linear operations in an appropriately constructed embedding space, and that projection error rather than inference error is the main bottleneck. These findings shed light on some key challenges in reduced-order modeling of chaotic systems and point toward opportunities for hybrid approaches that combine efficient reduced-order models as baselines with more sophisticated AI architectures, particularly for applications in long-term climate modeling where computational efficiency is paramount.
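The idea of learning a linear operator in a time-delayed embedding can be illustrated on a toy problem. The sketch below assumes a scalar latent signal and two delays, and fits x[k+1] = a*x[k] + b*x[k-1] by least squares; there is no autoencoder, and the function names and the cosine test signal are hypothetical choices made for this illustration. A pure cosine obeys such a recurrence exactly, with a = 2*cos(omega) and b = -1, even though no linear map of the current value alone can advance it.

```python
import math


def delay_pairs(series, n_delays):
    """Return (delay_vector, next_value) pairs; delay vectors are most-recent-first."""
    pairs = []
    for k in range(n_delays - 1, len(series) - 1):
        window = [series[k - j] for j in range(n_delays)]
        pairs.append((window, series[k + 1]))
    return pairs


def fit_two_delay_model(series):
    """Least-squares fit of x[k+1] = a*x[k] + b*x[k-1] via the 2x2 normal equations."""
    s11 = s12 = s22 = r1 = r2 = 0.0
    for (x0, x1), y in delay_pairs(series, 2):
        s11 += x0 * x0
        s12 += x0 * x1
        s22 += x1 * x1
        r1 += x0 * y
        r2 += x1 * y
    det = s11 * s22 - s12 * s12
    a = (r1 * s22 - r2 * s12) / det
    b = (r2 * s11 - r1 * s12) / det
    return a, b


def forecast(last_two, a, b, n_steps):
    """Roll the fitted linear model forward from the last two observed values."""
    x_prev, x_curr = last_two
    out = []
    for _ in range(n_steps):
        x_prev, x_curr = x_curr, a * x_curr + b * x_prev
        out.append(x_curr)
    return out
```

For a signal such as `[math.cos(0.3 * k) for k in range(50)]`, the fitted coefficients recover the recurrence of the cosine up to round-off, and `forecast` continues the oscillation beyond the fitted window.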
CHEM-PH Oct 20, 2025
Foundation Models for Discovery and Exploration in Chemical Space
Alexius Wadell, Anoushka Bhutani, Victor Azumah et al.
Accurate prediction of atomistic, thermodynamic, and kinetic properties from molecular structures underpins materials innovation. Existing computational and experimental approaches lack the scalability required to efficiently navigate chemical space. Scientific foundation models trained on large unlabeled datasets offer a path toward exploring chemical space across diverse application domains. Here we develop MIST, a family of molecular foundation models with up to an order of magnitude more parameters and data than prior works. Trained using a novel tokenization scheme that comprehensively captures nuclear, electronic, and geometric information, MIST learns from a diverse range of molecules. MIST models have been fine-tuned to predict more than 400 structure-property relationships and match or exceed state-of-the-art performance across benchmarks spanning physiology, electrochemistry, and quantum chemistry. We demonstrate the ability of these models to solve real-world problems across chemical space, including multiobjective electrolyte solvent screening, olfactory perception mapping, isotope half-life prediction, stereochemical reasoning for chiral organometallic compounds, and binary and multi-component mixture property prediction. Probing MIST models using mechanistic interpretability methods reveals identifiable patterns and trends not explicitly present in the training data, suggesting that the models learn generalizable scientific concepts. We formulate hyperparameter-penalized Bayesian neural scaling laws and use them to reduce the computational cost of model development by an order of magnitude. The methods and findings presented here represent a significant step toward accelerating materials discovery, design, and optimization using foundation models and provide valuable guidance for training compute-optimal scientific foundation models.
SY Jun 9, 2025
A Digital Twin Framework for Generation-IV Reactors with Reinforcement Learning-Enabled Health-Aware Supervisory Control
Jasmin Y. Lim, Dimitrios Pylorof, Humberto E. Garcia et al.
Generation IV (Gen-IV) nuclear power plants are envisioned to replace the current reactor fleet, bringing improvements in performance, safety, reliability, and sustainability. However, large cost investments currently inhibit the deployment of these advanced reactor concepts. Digital twins bridge real-world systems with digital tools to reduce costs, enhance decision-making, and boost operational efficiency. In this work, a digital twin framework is designed to operate the Gen-IV Fluoride-salt-cooled High-temperature Reactor, utilizing data-enhanced methods to optimize operational and maintenance policies while adhering to system constraints. The closed-loop framework integrates surrogate modeling, reinforcement learning, and Bayesian inference to streamline end-to-end communication for online regulation and self-adjustment. Reinforcement learning is used to consider component health and degradation to drive the target power generations, with constraints enforced through a Reference Governor control algorithm that ensures compliance with pump flow rate and temperature limits. These input driving modules benefit from detailed online simulations that are assimilated to measurement data with Bayesian filtering. The digital twin is demonstrated in three case studies: a one-year long-term operational period showcasing maintenance planning capabilities, short-term accuracy refinement with high-frequency measurements, and system shock capturing that demonstrates real-time recalibration capabilities when boundary conditions change. These demonstrations validate robustness for health-aware and constraint-informed nuclear plant operation, with general applicability to other advanced reactor concepts and complex engineering systems.
COMP-PH Jan 16, 2024
Enhancing Dynamical System Modeling through Interpretable Machine Learning Augmentations: A Case Study in Cathodic Electrophoretic Deposition
Christian Jacobsen, Jiayuan Dong, Mehdi Khalloufi et al.
We introduce a comprehensive data-driven framework aimed at enhancing the modeling of physical systems, employing inference techniques and machine learning enhancements. As a demonstrative application, we pursue the modeling of cathodic electrophoretic deposition (EPD), commonly known as e-coating. Our approach illustrates a systematic procedure for enhancing physical models by identifying their limitations through inference on experimental data and introducing adaptable model enhancements to address these shortcomings. We begin by tackling the issue of model parameter identifiability, which reveals aspects of the model that require improvement. To address generalizability, we introduce modifications which also enhance identifiability. However, these modifications do not fully capture essential experimental behaviors. To overcome this limitation, we incorporate interpretable yet flexible augmentations into the baseline model. These augmentations are parameterized by simple fully-connected neural networks (FNNs), and we leverage machine learning tools, particularly Neural Ordinary Differential Equations (Neural ODEs), to learn these augmentations. Our simulations demonstrate that the machine learning-augmented model more accurately captures observed behaviors and improves predictive accuracy. Nevertheless, we contend that while the model updates offer superior performance and capture the relevant physics, we can reduce off-line computational costs by eliminating certain dynamics without compromising accuracy or interpretability in downstream predictions of quantities of interest, particularly film thickness predictions. The entire process outlined here provides a structured approach to leverage data-driven methods. First, it helps us comprehend the root causes of model inaccuracies, and second, it offers a principled method for enhancing model performance.
LG Sep 15, 2021
Conditionally Parameterized, Discretization-Aware Neural Networks for Mesh-Based Modeling of Physical Systems
Jiayang Xu, Aniruddhe Pradhan, Karthik Duraisamy
Simulations of complex physical systems are typically realized by discretizing partial differential equations (PDEs) on unstructured meshes. While neural networks have recently been explored for surrogate and reduced order modeling of PDE solutions, they often ignore interactions or hierarchical relations between input features, and process them as concatenated mixtures. We generalize the idea of conditional parameterization -- using trainable functions of input parameters to generate the weights of a neural network, and extend them in a flexible way to encode critical information. Inspired by discretized numerical methods, choices of the parameters include physical quantities and mesh topology features. The functional relation between the modeled features and the parameters is built into the network architecture. The method is implemented on different networks and applied to frontier scientific machine learning tasks including the discovery of unmodeled physics, super-resolution of coarse fields, and the simulation of unsteady flows with chemical reactions. The results show that the conditionally-parameterized networks provide superior performance compared to their traditional counterparts. The CP-GNet - an architecture that can be trained on very few data snapshots - is proposed as the first deep learning model capable of standalone prediction of reacting flows on irregular meshes.
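The contrast between conditional parameterization and plain concatenation can be illustrated with a minimal sketch. It is not the architecture from the paper: the affine weight generator and the names `generate_weights`, `conditional_layer`, `concatenated_layer`, `W_base`, and `W_slope` are hypothetical, chosen only to show a parameter `p` generating the weights that act on the features `x`.

```python
def matvec(W, x):
    """Dense matrix-vector product on plain Python lists."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def generate_weights(p, W_base, W_slope):
    """Hypothetical weight generator: W(p) = W_base + p * W_slope.

    A trainable function of the conditioning parameter p; an affine map is
    assumed here purely for illustration.
    """
    return [[b + p * s for b, s in zip(row_b, row_s)]
            for row_b, row_s in zip(W_base, W_slope)]


def conditional_layer(x, p, W_base, W_slope):
    """Conditionally parameterized layer: p acts multiplicatively on x."""
    return matvec(generate_weights(p, W_base, W_slope), x)


def concatenated_layer(x, p, W):
    """Conventional alternative: p is appended to x and mixed additively."""
    return matvec(W, x + [p])


if __name__ == "__main__":
    x = [2.0, 3.0]
    W_base = [[1.0, 0.0], [0.0, 1.0]]
    W_slope = [[0.0, 1.0], [1.0, 0.0]]
    print(conditional_layer(x, 0.5, W_base, W_slope))
    print(concatenated_layer(x, 0.5, [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]))
```

In the conditional layer the output contains products of `p` and the entries of `x`, whereas a single concatenated layer can only add a term proportional to `p`.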
COMP-PH Sep 15, 2021
Disentangling Generative Factors of Physical Fields Using Variational Autoencoders
Christian Jacobsen, Karthik Duraisamy
The ability to extract generative parameters from high-dimensional fields of data in an unsupervised manner is a highly desirable yet unrealized goal in computational physics. This work explores the use of variational autoencoders (VAEs) for non-linear dimension reduction with the specific aim of disentangling the low-dimensional latent variables to identify independent physical parameters that generated the data. A disentangled decomposition is interpretable, and can be transferred to a variety of tasks including generative modeling, design optimization, and probabilistic reduced order modelling. A major emphasis of this work is to characterize disentanglement using VAEs while minimally modifying the classic VAE loss function (i.e., the Evidence Lower Bound) to maintain high reconstruction accuracy. The loss landscape is characterized by over-regularized local minima which surround desirable solutions. We illustrate comparisons between disentangled and entangled representations by juxtaposing learned latent distributions and the true generative factors in a model porous flow problem. Hierarchical priors are shown to facilitate the learning of disentangled representations. The regularization loss is unaffected by latent rotation when training with rotationally-invariant priors, and thus learning non-rotationally-invariant priors aids in capturing the properties of generative factors, improving disentanglement. Finally, it is shown that semi-supervised learning - accomplished by labeling a small number of samples ($O(1\%)$) - results in accurate disentangled latent representations that can be consistently learned.
COMP-PH Sep 14, 2021
Discretization-independent surrogate modeling over complex geometries using hypernetworks and implicit representations
James Duvall, Karthik Duraisamy, Shaowu Pan
Numerical solutions of partial differential equations (PDEs) require expensive simulations, limiting their application in design optimization, model-based control, and large-scale inverse problems. Surrogate modeling techniques seek to decrease the computational expense while retaining dominant solution features and behavior. Traditional Convolutional Neural Network-based frameworks for surrogate modeling require lossy pixelization and data-preprocessing, and generally are not effective in realistic engineering applications. We propose alternative deep-learning-based surrogate models for discretization-independent, continuous representations of PDE solutions, which can be used for learning and prediction over domains with complex, variable geometry and mesh topology. Three methods are proposed and compared: design-variable-coded multi-layer perceptron (DV-MLP), design-variable hypernetworks (DV-Hnet), and non-linear independent dual system (NIDS). Each method utilizes a main network which consumes pointwise spatial information to provide a continuous representation, allowing predictions at any location in the domain. Input features include a minimum-distance function evaluation to implicitly encode the problem geometry. The geometric design variables, which define and distinguish problem instances, are used differently by each method, appearing as additional main-network input features (DV-MLP), or as hypernetwork inputs (DV-Hnet and NIDS). The methods are applied to predict solutions around complex, parametrically-defined geometries on non-parametrically-defined meshes, with model predictions obtained many orders of magnitude faster than the full order models. Test cases include a vehicle-aerodynamics problem with complex geometry and limited training data, for which a design-variable hypernetwork performs best and achieves a competitive time-to-best-model despite a much greater parameter count.
DS Feb 25, 2020
Sparsity-promoting algorithms for the discovery of informative Koopman invariant subspaces
Shaowu Pan, Nicholas Arnold-Medabalimi, Karthik Duraisamy
Koopman decomposition is a non-linear generalization of eigen-decomposition, and is being increasingly utilized in the analysis of spatio-temporal dynamics. Well-known techniques such as the dynamic mode decomposition (DMD) and its linear variants provide approximations to the Koopman operator, and have been applied extensively in many fluid dynamic problems. Despite being endowed with a richer dictionary of nonlinear observables, nonlinear variants of the DMD, such as extended/kernel dynamic mode decomposition (EDMD/KDMD) are seldom applied to large-scale problems primarily due to the difficulty of discerning the Koopman invariant subspace from thousands of resulting Koopman eigenmodes. To address this issue, we propose a framework based on multi-task feature learning to extract the most informative Koopman invariant subspace by removing redundant and spurious Koopman triplets. In particular, we develop a pruning procedure that penalizes departure from linear evolution. These algorithms can be viewed as sparsity promoting extensions of EDMD/KDMD. Further, we extend KDMD to a continuous-time setting and show a relationship between the present algorithm, sparsity-promoting DMD, and an empirical criterion from the viewpoint of non-convex optimization. The effectiveness of our algorithm is demonstrated on examples ranging from simple dynamical systems to two-dimensional cylinder wake flows at different Reynolds numbers and a three-dimensional turbulent ship air-wake flow. The latter two problems are designed such that very strong nonlinear transients are present, thus requiring an accurate approximation of the Koopman operator. Underlying physical mechanisms are analyzed, with an emphasis on characterizing transient dynamics. The results are compared to existing theoretical expositions and numerical approximations.
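The idea of penalizing departure from linear evolution can be illustrated on a toy problem. The sketch below is not the pruning algorithm of the paper: the discrete map x_{k+1} = 0.9 x_k, the candidate observables, and the names `trajectory` and `linear_evolution_error` are assumptions made for illustration. A true Koopman eigenfunction phi with eigenvalue lambda satisfies phi(x_{k+1}) = lambda * phi(x_k) along a trajectory, so a normalized one-step residual separates genuine eigenpairs from spurious ones.

```python
import math


def trajectory(x0, rate, steps):
    """Snapshots of the toy linear map x_{k+1} = rate * x_k (assumed system)."""
    states = [x0]
    for _ in range(steps):
        states.append(rate * states[-1])
    return states


def linear_evolution_error(phi, eigenvalue, states):
    """Normalized residual of phi(x_{k+1}) = eigenvalue * phi(x_k) over the snapshots."""
    values = [phi(x) for x in states]
    residual = sum((nxt - eigenvalue * cur) ** 2
                   for cur, nxt in zip(values, values[1:]))
    scale = sum(cur ** 2 for cur in values[:-1])
    return math.sqrt(residual / scale)


if __name__ == "__main__":
    states = trajectory(1.0, 0.9, 20)
    candidates = {
        "x (eigenvalue 0.9)": (lambda x: x, 0.9),
        "x^2 (eigenvalue 0.81)": (lambda x: x * x, 0.81),
        "sin(3x) (spurious, 0.9)": (lambda x: math.sin(3.0 * x), 0.9),
    }
    for name, (phi, lam) in candidates.items():
        print(name, linear_evolution_error(phi, lam, states))
```

Candidates whose residual exceeds a chosen threshold would be discarded; the threshold itself is a modeling choice not specified here.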
COMP-PH Dec 23, 2019
Multi-level Convolutional Autoencoder Networks for Parametric Prediction of Spatio-temporal Dynamics
Jiayang Xu, Karthik Duraisamy
A data-driven framework is proposed towards the end of predictive modeling of complex spatio-temporal dynamics, leveraging nested non-linear manifolds. Three levels of neural networks are used, with the goal of predicting the future state of a system of interest in a parametric setting. A convolutional autoencoder is used as the top level to encode the high dimensional input data along spatial dimensions into a sequence of latent variables. A temporal convolutional autoencoder (TCAE) serves as the second level, which further encodes the output sequence from the first level along the temporal dimension, and outputs a set of latent variables that encapsulate the spatio-temporal evolution of the dynamics. The use of dilated temporal convolutions grows the receptive field exponentially with network depth, allowing for efficient processing of long temporal sequences typical of scientific computations. A fully-connected network is used as the third level to learn the mapping between these latent variables and the global parameters from training data, and predict them for new parameters. For future state predictions, the second level uses a temporal convolutional network to predict subsequent steps of the output sequence from the top level. Latent variables at the bottom-most level are decoded to obtain the dynamics in physical space at new global parameters and/or at a future time. Predictive capabilities are evaluated on a range of problems involving discontinuities, wave propagation, strong transients, and coherent structures. The sensitivity of the results to different modeling choices is assessed. The results suggest that given adequate data and careful training, effective data-driven predictive models can be constructed. Perspectives are provided on the present approach and its place in the landscape of model reduction.
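The exponential growth of the receptive field with depth can be checked with a short calculation. This is a sketch under assumptions not stated in the abstract: stride-1 convolutions whose dilation doubles at every layer (1, 2, 4, ...), which is a common convention for temporal convolutional networks. The function names are hypothetical.

```python
def receptive_field(kernel_size: int, num_layers: int, dilation_base: int = 2) -> int:
    """Receptive field, in time steps, of stacked dilated 1-D convolutions.

    Assumes stride 1 and dilation dilation_base**i at layer i (i = 0, 1, ...).
    Each layer widens the field by (kernel_size - 1) * dilation.
    """
    field = 1
    for layer in range(num_layers):
        field += (kernel_size - 1) * dilation_base ** layer
    return field


def undilated_receptive_field(kernel_size: int, num_layers: int) -> int:
    """Same stack with dilation 1 everywhere: growth is only linear in depth."""
    return 1 + (kernel_size - 1) * num_layers


if __name__ == "__main__":
    for depth in (1, 2, 4, 8):
        print(depth, receptive_field(2, depth), undilated_receptive_field(2, depth))
```

With kernel size 2, eight dilated layers cover 256 time steps, while eight undilated layers cover only 9.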
DS Jun 9, 2019
Physics-Informed Probabilistic Learning of Linear Embeddings of Non-linear Dynamics With Guaranteed Stability
Shaowu Pan, Karthik Duraisamy
The Koopman operator has emerged as a powerful tool for the analysis of nonlinear dynamical systems as it provides coordinate transformations to globally linearize the dynamics. While recent deep learning approaches have been useful in extracting the Koopman operator from a data-driven perspective, several challenges remain. In this work, we formalize the problem of learning the continuous-time Koopman operator with deep neural networks in a measure-theoretic framework. Our approach induces two types of models: differential and recurrent form, the choice of which depends on the availability of the governing equations and data. We then enforce a structural parameterization that renders the realization of the Koopman operator provably stable. A new autoencoder architecture is constructed, such that only the residual of the dynamic mode decomposition is learned. Finally, we employ mean-field variational inference (MFVI) on the aforementioned framework in a hierarchical Bayesian setting to quantify uncertainties in the characterization and prediction of the dynamics of observables. The framework is evaluated on a simple polynomial system, the Duffing oscillator, and an unstable cylinder wake flow with noisy measurements.
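One way a structural parameterization can guarantee stability is sketched below. It is an illustrative construction and not necessarily the one used in the paper: the continuous-time generator is assumed to take the form K = S - D, with S skew-symmetric and D diagonal with non-negative entries. Then x^T K x = -x^T D x <= 0 for every real x, so no eigenvalue of K has a positive real part, whatever values the free parameters take. The names `stable_generator`, `skew_params`, and `damping_params` are hypothetical.

```python
import random


def stable_generator(skew_params, damping_params):
    """Build K = S - D from unconstrained parameters.

    skew_params: n*(n-1)/2 values filling the strict upper triangle of S.
    damping_params: n values; their squares form the diagonal of D.
    """
    n = len(damping_params)
    K = [[0.0] * n for _ in range(n)]
    index = 0
    for i in range(n):
        K[i][i] = -damping_params[i] ** 2  # non-positive diagonal
        for j in range(i + 1, n):
            K[i][j] = skew_params[index]   # skew-symmetric coupling
            K[j][i] = -skew_params[index]
            index += 1
    return K


def quadratic_form(K, x):
    """Evaluate x^T K x, the instantaneous energy growth rate of dx/dt = K x."""
    n = len(x)
    return sum(x[i] * K[i][j] * x[j] for i in range(n) for j in range(n))


if __name__ == "__main__":
    rng = random.Random(0)
    K = stable_generator([rng.gauss(0, 1) for _ in range(3)],
                         [rng.gauss(0, 1) for _ in range(3)])
    worst = max(quadratic_form(K, [rng.gauss(0, 1) for _ in range(3)])
                for _ in range(1000))
    print("largest sampled x^T K x:", worst)
```

Because the constraint is built into the form of K, an optimizer can update `skew_params` and `damping_params` freely without ever leaving the stable set.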
ML May 31, 2018
Long-time predictive modeling of nonlinear dynamical systems using neural networks
Shaowu Pan, Karthik Duraisamy
We study the use of feedforward neural networks (FNNs) to develop models of nonlinear dynamical systems from data. Emphasis is placed on predictions at long times, with limited data availability. Inspired by global stability analysis, and the observation of the strong correlation between the local error and the maximum singular value of the Jacobian of the FNN, we introduce Jacobian regularization in the loss function. This regularization suppresses the sensitivity of the prediction to the local error and is shown to improve accuracy and robustness. Comparison between the proposed approach and sparse polynomial regression is presented in numerical examples ranging from simple ODE systems to nonlinear PDE systems including vortex shedding behind a cylinder, and instability-driven buoyant mixing flow. Furthermore, limitations of feedforward neural networks are highlighted, especially when the training data does not include a low dimensional attractor. Strategies of data augmentation are presented as remedies to address these issues to a certain extent.
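A Jacobian penalty of this kind can be sketched for a one-hidden-layer tanh network. The exact form of the regularizer is not given in the abstract, so the squared Frobenius norm used here is an assumption; it is a convenient choice because the Frobenius norm bounds the maximum singular value from above. The network layout and the names `forward`, `jacobian`, and `regularized_loss` are hypothetical.

```python
import math


def hidden_preactivations(x, W1, b1):
    """z = W1 x + b1 for the single hidden layer."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]


def forward(x, W1, b1, W2, b2):
    """y = W2 tanh(W1 x + b1) + b2."""
    hidden = [math.tanh(z) for z in hidden_preactivations(x, W1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b for row, b in zip(W2, b2)]


def jacobian(x, W1, b1, W2):
    """Analytic dy/dx: J[i][j] = sum_k W2[i][k] * (1 - tanh(z_k)^2) * W1[k][j]."""
    slopes = [1.0 - math.tanh(z) ** 2 for z in hidden_preactivations(x, W1, b1)]
    return [[sum(row[k] * slopes[k] * W1[k][j] for k in range(len(W1)))
             for j in range(len(x))]
            for row in W2]


def regularized_loss(x, target, params, weight):
    """Mean squared error plus weight * ||J||_F^2 (assumed form of the penalty)."""
    W1, b1, W2, b2 = params
    prediction = forward(x, W1, b1, W2, b2)
    mse = sum((p - t) ** 2 for p, t in zip(prediction, target)) / len(target)
    penalty = sum(v * v for row in jacobian(x, W1, b1, W2) for v in row)
    return mse + weight * penalty
```

Setting `weight` to zero recovers the plain data-fitting loss, so the strength of the sensitivity penalty can be tuned against the fit to the data.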
DS Mar 25, 2018
Data-driven Discovery of Closure Models
Shaowu Pan, Karthik Duraisamy
Derivation of reduced order representations of dynamical systems requires modeling the effect of the truncated dynamics on the retained dynamics. In its most general form, this so-called closure model has to account for memory effects. In this work, we present a framework of operator inference to extract the governing dynamics of closure from data in a compact, non-Markovian form. We employ sparse polynomial regression and artificial neural networks to extract the underlying operator. For a special class of non-linear systems, observability of the closure in terms of the resolved dynamics is analyzed and theoretical results are presented on the compactness of the memory. The proposed framework is evaluated on examples ranging from linear to nonlinear systems, with and without chaotic dynamics, with an emphasis on predictive performance on unseen data.