Karthik Duraisamy

h-index: 29

22 papers

609 citations

Novelty: 53%

AI Score: 57

Ranked #5,552 of 194,257 authors (top 3%); #1,504 in LG (top 4%)

22 Papers

2.3 · NA · Dec 27, 2017

A Unified Framework for Multiscale Modeling using the Mori-Zwanzig Formalism and the Variational Multiscale Method

Eric J. Parish, Karthik Duraisamy

We describe a paradigm for multiscale modeling that combines the Mori-Zwanzig (MZ) formalism of Statistical Mechanics with the Variational Multiscale (VMS) method. The MZ-VMS approach leverages both VMS scale-separation projectors as well as phase-space projectors to provide a systematic modeling approach that is applicable to non-linear partial differential equations. Spectral as well as continuous and discontinuous finite element methods are considered. The framework leads to a formally closed equation in which the effect of the unresolved scales on the resolved scales is non-local in time and appears as a convolution or memory integral. The resulting non-Markovian system is used as a starting point for model development. We discover that unresolved scales lead to memory effects that are driven by an orthogonal projection of the coarse-scale residual and inter-element jumps. It is further shown that an MZ-based finite memory model is a variant of the well-known adjoint-stabilization method. For hyperbolic equations, this stabilization is shown to have the form of an artificial viscosity term. We further establish connections between the memory kernel and approximate Riemann solvers. It is demonstrated that, in the case of one-dimensional linear advection, the assumption of a finite memory and a linear quadrature leads to a closure term that is formally equivalent to an upwind flux correction.
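
The last claim can be illustrated with a standard identity, shown here as a minimal sketch rather than the paper's MZ-based derivation: for linear advection u_t + a u_x = 0, a central interface flux plus a dissipative correction proportional to |a| times the interface jump reproduces the upwind flux. The function names are hypothetical and chosen only for this illustration.

```python
def central_flux(a, u_left, u_right):
    """Central (average) interface flux for f(u) = a * u."""
    return 0.5 * a * (u_left + u_right)


def upwind_flux(a, u_left, u_right):
    """Upwind flux: take the state on the side the wave comes from."""
    return a * u_left if a >= 0 else a * u_right


def corrected_flux(a, u_left, u_right):
    """Central flux plus a jump-proportional correction (assumed form)."""
    return central_flux(a, u_left, u_right) - 0.5 * abs(a) * (u_right - u_left)


# Compare the two fluxes for both wave directions.
for a in (1.5, -0.7):
    print(a, upwind_flux(a, 2.0, 3.0), corrected_flux(a, 2.0, 3.0))
```

The correction term acts like an artificial viscosity at the interface, which is the sense in which an upwind scheme can be read as a central scheme plus a closure term.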

1.2 · NA · Nov 21, 2018

On entropy stable temporal fluxes

Ayoub Gouasmi, Karthik Duraisamy, Scott Murman

Entropy-stable (ES) schemes have gained considerable attention over the last decade, especially in the context of turbulent flow simulations using high-order methods. While promising because of their nonlinear stability properties, ES schemes have to address a number of issues to become practical. One of them is how much entropy should be produced by the scheme at a certain level of under-resolution. This problem has so far been studied by considering different ES interface fluxes in the spatial discretization only, because they can be tuned to generate a certain amount of entropy. In this note, we point out that, in the context of space-time discretizations, the same applies to ES interface fluxes in the temporal direction.

5.6 · LG · May 27

History-aware adaptive reduced-order models via incremental singular value decomposition

Amirpasha Hedayat, Ali Mohaghegh, Laura Balzano et al.

Reduced-order models (ROMs) can accelerate high-dimensional dynamical simulations, but their accuracy often deteriorates when online dynamics leave the regime represented by offline training data. We develop a projection-based adaptive ROM framework based on incremental singular value decomposition (iSVD), in which occasional full-order operator evaluations provide correction snapshots for online basis updates. The intrusive ROMs considered here are fully parameterized by the basis, so each update naturally propagates to reduced operators and hyper-reduction machinery. Through its evolving singular structure, iSVD retains an encoded history of the observed dynamics and is history-aware in this sense. We study the method on three nonlinear problems of increasing complexity: the one-dimensional viscous Burgers equation, the Sod shock tube, and a stiff one-dimensional ten-species rotating detonation engine (RDE). The Burgers problem is used to analyze the method and compare iSVD with alternative basis adaptation rules, showing that history-aware updates outperform instantaneous updates and that iSVD gives the strongest overall performance. The Sod and RDE cases demonstrate that these advantages persist in more challenging compressible-flow settings. For the RDE problem, the iSVD adaptive ROM improves upon the current state-of-the-art Direct adaptive ROM baseline in both predictive accuracy and computational efficiency. A cost analysis shows that the dominant online cost comes from interacting with the full-order model to obtain correction snapshots, while the iSVD update itself is negligible. These results identify iSVD as an effective mechanism for online learning of reduced subspaces and suggest a path toward ROMs that remain predictive over horizons several orders of magnitude longer than their initial training window.
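
The abstract does not spell out the update rule, so the following is only a sketch of the standard column-append incremental SVD (in the style of Brand's update), assuming that each correction snapshot arrives as one new column. The function `isvd_append` and its arguments are hypothetical names; only the left singular vectors and singular values are tracked, since those define the reduced basis.

```python
import numpy as np


def isvd_append(U, s, c, max_rank=None, tol=1e-12):
    """Update thin SVD factors (U, s) of a snapshot matrix after appending column c."""
    p = U.T @ c                    # coordinates of c in the current basis
    r = c - U @ p                  # component of c outside the current basis
    rho = np.linalg.norm(r)
    k = s.size
    if rho > tol:
        # New direction found: enlarge the basis by one vector.
        K = np.zeros((k + 1, k + 1))
        K[:k, :k] = np.diag(s)
        K[:k, k] = p
        K[k, k] = rho
        U_ext = np.column_stack([U, r / rho])
    else:
        # c already lies in span(U): only rotate and rescale.
        K = np.column_stack([np.diag(s), p])
        U_ext = U
    Uk, sk, _ = np.linalg.svd(K, full_matrices=False)
    U_new, s_new = U_ext @ Uk, sk
    if max_rank is not None:       # optional truncation to a fixed basis size
        U_new, s_new = U_new[:, :max_rank], s_new[:max_rank]
    return U_new, s_new


# Demo on synthetic data: start from 3 snapshots, stream in 5 more.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 8))
U, s, _ = np.linalg.svd(A[:, :3], full_matrices=False)
for j in range(3, A.shape[1]):
    U, s = isvd_append(U, s, A[:, j])
```

Each update only factorizes a small (k+1) x (k+1) matrix, which is consistent with the abstract's remark that the iSVD update itself is cheap compared with querying the full-order model. Without truncation the streamed factors match a batch SVD of all snapshots.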

9.9 · NA · May 13

Fully Discrete Active Flux Method based on Transported Acoustic Increments for the Compressible Euler Equations

Karthik Duraisamy

A fully discrete Active Flux method is proposed for the 2D compressible Euler equations. The method builds on the evolution-operator formulation proposed by Roe in which conservative cell averages are updated by unsplit flux quadrature while primitive point values are evolved by acoustic and advective subsolvers. The proposed method reconstructs the acoustic increment as a cellwise Q2 field and evaluates this field at the convective foot of the target point. For constant frozen coefficients, the resulting point update reduces to the transported composition, eliminating the additive split defect and yielding the exact unsplit frozen evolution when the acoustic and advective generators commute. The resulting method preserves the exact locally linearized acoustic evolution operator of Barsukow (2025), the compact stencil, and the conservative one-stage average update. Numerical experiments probe several facets of the numerical method. A mixed Fourier wave packet isolates the split error and shows third-order point accuracy for the transported update, compared with second-order behavior for the additive update. Isentropic vortex convection confirms third-order convergence for the full nonlinear scheme, reduced error constants, and an enlarged empirical CFL range. Nonlinear Gaussian acoustic pulse evolution demonstrates preservation of radial symmetry and near-third-order decay of the symmetry error. Low-Mach shear layer tests show coherent vorticity evolution, ultra-low entropy dissipation, and absence of the coarse-grid secondary vortices seen in displayed DG/CG comparisons. Finally, a compressible under-resolved Kelvin-Helmholtz test demonstrates robust no-limiter evolution to late time with consistent entropy dissipation. Fourier diagnostics of the vertical-edge point operator support the observed improvements in acoustic phase and amplification behavior.

5.1 · DS · Apr 24, 2023

On the lifting and reconstruction of nonlinear systems with multiple invariant sets

Shaowu Pan, Karthik Duraisamy

The Koopman operator provides a linear perspective on non-linear dynamics by focusing on the evolution of observables in an invariant subspace. Observables of interest are typically linearly reconstructed from the Koopman eigenfunctions. Despite the broad use of Koopman operators over the past few years, there exist some misconceptions about the applicability of Koopman operators to dynamical systems with multiple disjoint invariant sets (e.g., basins of attraction of isolated fixed points). In this work, we first provide a simple explanation for the mechanism of linear reconstruction-based Koopman operators of nonlinear systems with multiple disjoint invariant sets. Next, we discuss the use of discrete symmetry among such invariant sets to construct Koopman eigenfunctions in a data-efficient manner. Finally, several numerical examples are provided to illustrate the benefits of exploiting symmetry for learning the Koopman operator.

17.0 · LG · Aug 30, 2024 · Code

Spatially-Aware Diffusion Models with Cross-Attention for Global Field Reconstruction with Sparse Observations

Yilin Zhuang, Sibo Cheng, Karthik Duraisamy

Diffusion models have gained attention for their ability to represent complex distributions and incorporate uncertainty, making them ideal for robust predictions in the presence of noisy or incomplete data. In this study, we develop and enhance score-based diffusion models in field reconstruction tasks, where the goal is to estimate complete spatial fields from partial observations. We introduce a condition encoding approach to construct a tractable mapping between observed and unobserved regions using a learnable integration of sparse observations and interpolated fields as an inductive bias. With refined sensing representations and an unraveled temporal dimension, our method can handle arbitrary moving sensors and effectively reconstruct fields. Furthermore, we conduct a comprehensive benchmark of our approach against a deterministic interpolation-based method across various static and time-dependent PDEs. Our study attempts to address the gap in strong baselines for evaluating performance across varying sampling hyperparameters, noise levels, and conditioning methods. Our results show that diffusion models with cross-attention and the proposed conditional encoding generally outperform other methods under noisy conditions, although the deterministic method excels with noiseless data. Additionally, both the diffusion models and the deterministic method surpass the numerical approach in accuracy and computational cost for the steady problem. We also demonstrate the ability of the model to capture possible reconstructions and improve the accuracy of fused results in covariance-based correction tasks using ensemble sampling.

16.9 · LG · Jun 10, 2025 · Code

LaDCast: A Latent Diffusion Model for Medium-Range Ensemble Weather Forecasting

Yilin Zhuang, Karthik Duraisamy

Accurate probabilistic weather forecasting demands both high accuracy and efficient uncertainty quantification, challenges that overburden both ensemble numerical weather prediction (NWP) and recent machine-learning methods. We introduce LaDCast, the first global latent-diffusion framework for medium-range ensemble forecasting, which generates hourly ensemble forecasts entirely in a learned latent space. An autoencoder compresses high-dimensional ERA5 reanalysis fields into a compact representation, and a transformer-based diffusion model produces sequential latent updates with arbitrary hour initialization. The model incorporates Geometric Rotary Position Embedding (GeoRoPE) to account for the Earth's spherical geometry, a dual-stream attention mechanism for efficient conditioning, and sinusoidal temporal embeddings to capture seasonal patterns. LaDCast achieves deterministic and probabilistic skill close to that of the European Centre for Medium-Range Weather Forecasts (ECMWF) IFS-ENS, without any explicit perturbations. Notably, LaDCast demonstrates superior performance in tracking rare extreme events such as cyclones, capturing their trajectories more accurately than established models. By operating in latent space, LaDCast reduces storage and compute by orders of magnitude, demonstrating a practical path toward forecasting at kilometer-scale resolution in real time. We open-source our code and models and provide the training and evaluation pipelines at: https://github.com/tonyzyl/ladcast.

25.0 · LG · Dec 16, 2023

CoCoGen: Physically-Consistent and Conditioned Score-based Generative Models for Forward and Inverse Problems

Christian Jacobsen, Yilin Zhuang, Karthik Duraisamy

Recent advances in generative artificial intelligence have had a significant impact on diverse domains spanning computer vision, natural language processing, and drug discovery. This work extends the reach of generative models into physical problem domains, particularly addressing the efficient enforcement of physical laws and conditioning for forward and inverse problems involving partial differential equations (PDEs). Our work introduces two key contributions: firstly, we present an efficient approach to promote consistency with the underlying PDE. By incorporating discretized information into score-based generative models, our method generates samples closely aligned with the true data distribution, showcasing residuals comparable to data generated through conventional PDE solvers, significantly enhancing fidelity. Secondly, we showcase the potential and versatility of score-based generative models in various physics tasks, specifically highlighting surrogate modeling as well as probabilistic field reconstruction and inversion from sparse measurements. A robust foundation is laid by designing unconditional score-based generative models that utilize reversible probability flow ordinary differential equations. Leveraging conditional models that require minimal training, we illustrate their flexibility when combined with a frozen unconditional model. These conditional models generate PDE solutions by incorporating parameters, macroscopic quantities, or partial field measurements as guidance. The results illustrate the inherent flexibility of score-based generative models and explore the synergy between unconditional score-based generative models and the present physically-consistent sampling approach, emphasizing the power and flexibility in solving for and inverting physical fields governed by differential equations, and in other scientific machine learning tasks.

17.6 · LG · Apr 8, 2024

Variational Bayesian Optimal Experimental Design with Normalizing Flows

Jiayuan Dong, Christian Jacobsen, Mehdi Khalloufi et al.

Bayesian optimal experimental design (OED) seeks experiments that maximize the expected information gain (EIG) in model parameters. Directly estimating the EIG using nested Monte Carlo is computationally expensive and requires an explicit likelihood. Variational OED (vOED), in contrast, estimates a lower bound of the EIG without likelihood evaluations by approximating the posterior distributions with variational forms, and then tightens the bound by optimizing its variational parameters. We introduce the use of normalizing flows (NFs) for representing variational distributions in vOED; we call this approach vOED-NFs. Specifically, we adopt NFs with a conditional invertible neural network architecture built from compositions of coupling layers, and enhanced with a summary network for data dimension reduction. We present Monte Carlo estimators of the lower bound along with gradient expressions to enable a gradient-based simultaneous optimization of the variational parameters and the design variables. The vOED-NFs algorithm is then validated in two benchmark problems, and demonstrated on a partial differential equation-governed application of cathodic electrophoretic deposition and an implicit likelihood case with stochastic modeling of aphid population. The findings suggest that a composition of 4 to 5 coupling layers is able to achieve lower EIG estimation bias, under a fixed budget of forward model runs, compared to previous approaches. The resulting NFs produce approximate posteriors that agree well with the true posteriors and capture non-Gaussian and multi-modal features effectively.
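
To make the baseline concrete, here is a minimal sketch of the nested Monte Carlo EIG estimator that the abstract contrasts with vOED. It is not the vOED-NFs method. The toy model is an assumption made for illustration: y = theta * d + noise with theta ~ N(0, 1) and Gaussian noise of standard deviation sigma, for which the EIG has the closed form 0.5 * log(1 + d^2 / sigma^2).

```python
import math
import random


def log_gauss(x, mean, std):
    """Log-density of a univariate normal distribution."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2.0 * math.pi))


def nested_mc_eig(d, sigma=0.5, n_outer=1000, n_inner=300, seed=0):
    """Nested Monte Carlo estimate of the EIG for the toy model y = theta*d + noise."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_outer):
        theta = rng.gauss(0.0, 1.0)                 # draw from the prior
        y = theta * d + rng.gauss(0.0, sigma)       # simulate an observation
        log_lik = log_gauss(y, theta * d, sigma)
        # Inner loop: estimate the evidence p(y | d) with fresh prior samples.
        evidence = sum(
            math.exp(log_gauss(y, rng.gauss(0.0, 1.0) * d, sigma))
            for _ in range(n_inner)
        ) / n_inner
        total += log_lik - math.log(evidence)
    return total / n_outer


def exact_eig(d, sigma=0.5):
    """Closed-form EIG for the linear-Gaussian toy model."""
    return 0.5 * math.log(1.0 + (d / sigma) ** 2)


print(nested_mc_eig(1.0), exact_eig(1.0))
```

The estimator needs n_outer * n_inner likelihood evaluations per design, and every one of them requires an explicit likelihood, which is the cost and the restriction that motivate a variational lower bound.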

7.8 · AI · Jun 26, 2025

Active Inference AI Systems for Scientific Discovery

Karthik Duraisamy

The rapid evolution of artificial intelligence has led to expectations of transformative impact on science, yet current systems remain fundamentally limited in enabling genuine scientific discovery. This perspective contends that progress turns on closing three mutually reinforcing gaps in abstraction, reasoning and empirical grounding. Central to addressing these gaps is recognizing complementary cognitive modes: thinking as slow, iterative hypothesis generation -- exploring counterfactual spaces where physical laws can be temporarily violated to discover new patterns -- and reasoning as fast, deterministic validation, traversing established knowledge graphs to test consistency with known principles. Abstractions in this loop should be manipulable models that enable counterfactual prediction, causal attribution, and refinement. Design principles -- rather than a monolithic recipe -- are proposed for systems that reason in imaginary spaces and learn from the world: causal, multimodal models for internal simulation; persistent, uncertainty-aware scientific memory that distinguishes hypotheses from established claims; formal verification pathways coupled to computations and experiments. It is also argued that the inherent ambiguity in feedback from simulations and experiments, and underlying uncertainties make human judgment indispensable, not as a temporary scaffold but as a permanent architectural component. Evaluations must assess the system's ability to identify novel phenomena, propose falsifiable hypotheses, and efficiently guide experimental programs toward genuine discoveries.

2.3 · CO · Dec 14, 2025

Flow matching Operators for Residual-Augmented Probabilistic Learning of Partial Differential Equations

Sahil Bhola, Karthik Duraisamy

Learning probabilistic surrogates for partial differential equations remains challenging in data-scarce regimes: neural operators require large amounts of high-fidelity data, while generative approaches typically sacrifice resolution invariance. We formulate flow matching in an infinite-dimensional function space to learn a probabilistic transport that maps low-fidelity approximations to the manifold of high-fidelity PDE solutions via learned residual corrections. We develop a conditional neural operator architecture based on feature-wise linear modulation for flow matching vector fields directly in function space, enabling inference at arbitrary spatial resolutions without retraining. To improve stability and representational control of the induced neural ODE, we parameterize the flow vector field as a sum of a linear operator and a nonlinear operator, combining lightweight linear components with a conditioned Fourier neural operator for expressive, input-dependent dynamics. We then formulate a residual-augmented learning strategy where the flow model learns probabilistic corrections from inexpensive low-fidelity surrogates to high-fidelity solutions, rather than learning the full solution mapping from scratch. Finally, we derive tractable training objectives that extend conditional flow matching to the operator setting with input-function-dependent couplings. To demonstrate the effectiveness of our approach, we present numerical experiments on a range of PDEs, including the 1D advection and Burgers' equations and a 2D Darcy flow problem for flow through a porous medium. We show that the proposed method can accurately learn solution operators across different resolutions and fidelities and produces uncertainty estimates that appropriately reflect model confidence, even when trained on limited high-fidelity data.

2.7 · LG · Feb 11

Toward Adaptive Non-Intrusive Reduced-Order Models: Design and Challenges

Amirpasha Hedayat, Alberto Padovan, Karthik Duraisamy

Projection-based Reduced Order Models (ROMs) are often deployed as static surrogates, which limits their practical utility once a system leaves the training manifold. We formalize and study adaptive non-intrusive ROMs that update both the latent subspace and the reduced dynamics online. Building on ideas from static non-intrusive ROMs, specifically, Operator Inference (OpInf) and the recently-introduced Non-intrusive Trajectory-based optimization of Reduced-Order Models (NiTROM), we propose three formulations: Adaptive OpInf (sequential basis/operator refits), Adaptive NiTROM (joint Riemannian optimization of encoder/decoder and polynomial dynamics), and a hybrid that initializes NiTROM with an OpInf update. We describe the online data window, adaptation window, and computational budget, and analyze cost scaling. On a transiently perturbed lid-driven cavity flow, static Galerkin/OpInf/NiTROM drift or destabilize when forecasting beyond training. In contrast, Adaptive OpInf robustly suppresses amplitude drift with modest cost; Adaptive NiTROM is shown to attain near-exact energy tracking under frequent updates but is sensitive to its initialization and optimization depth; the hybrid is most reliable under regime changes and minimal offline data, yielding physically coherent fields and bounded energy. We argue that predictive claims for ROMs must be cost-aware and transparent, with clear separation of training/adaptation/deployment regimes and explicit reporting of online budgets and full-order model queries. This work provides a practical template for building self-correcting, non-intrusive ROMs that remain effective as the dynamics evolve well beyond the initial manifold.

4.1 · LG · Nov 16, 2025

Attention-Enhanced Convolutional Autoencoder and Structured Delay Embeddings for Weather Prediction

Amirpasha Hedayat, Karthik Duraisamy

Weather prediction is a quintessential problem involving the forecasting of a complex, nonlinear, and chaotic high-dimensional dynamical system. This work introduces an efficient reduced-order modeling (ROM) framework for short-range weather prediction and investigates fundamental questions in dimensionality reduction and reduced order modeling of such systems. Unlike recent AI-driven models, which require extensive computational resources, our framework prioritizes efficiency while achieving reasonable accuracy. Specifically, a ResNet-based convolutional autoencoder augmented by block attention modules is developed to reduce the dimensionality of high-dimensional weather data. Subsequently, a linear operator is learned in the time-delayed embedding of the latent space to efficiently capture the dynamics. Using the ERA5 reanalysis dataset, we demonstrate that this framework performs well in-distribution as evidenced by effectively predicting weather patterns within training data periods. We also identify important limitations in generalizing to future states, particularly in maintaining prediction accuracy beyond the training window. Our analysis reveals that weather systems exhibit strong temporal correlations that can be effectively captured through linear operations in an appropriately constructed embedding space, and that projection error rather than inference error is the main bottleneck. These findings shed light on some key challenges in reduced-order modeling of chaotic systems and point toward opportunities for hybrid approaches that combine efficient reduced-order models as baselines with more sophisticated AI architectures, particularly for applications in long-term climate modeling where computational efficiency is paramount.
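
The step of learning a linear operator in a time-delayed embedding of the latent space can be sketched as a least-squares fit on stacked latent states. This is a minimal illustration under stated assumptions, not the paper's implementation: the latent sequence `Z` is a synthetic cosine standing in for autoencoder codes, and `delay_embed` and `fit_linear_operator` are hypothetical helper names.

```python
import numpy as np


def delay_embed(Z, d):
    """Stack d consecutive latent states: (T, m) -> (T - d + 1, d * m)."""
    T = Z.shape[0]
    return np.hstack([Z[i:T - d + 1 + i] for i in range(d)])


def fit_linear_operator(H):
    """Least-squares matrix A such that H[t + 1] is approximately H[t] @ A."""
    A, *_ = np.linalg.lstsq(H[:-1], H[1:], rcond=None)
    return A


# Synthetic one-dimensional "latent" trajectory (assumed, for illustration only).
t = np.arange(200)
Z = np.cos(0.1 * t)[:, None]

H = delay_embed(Z, d=2)          # each row holds (z_t, z_{t+1})
A = fit_linear_operator(H)

# Roll the linear model forward 50 steps from the first embedded state.
h = H[0].copy()
for _ in range(50):
    h = h @ A
print(h[0], np.cos(0.1 * 50))
```

A single cosine is not linear in z_t alone, but it satisfies an exact linear recurrence in two delays, so the fitted operator reproduces it. That is the basic reason delay coordinates let a linear map capture oscillatory dynamics.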

6.6 · CHEM-PH · Oct 20, 2025

Foundation Models for Discovery and Exploration in Chemical Space

Alexius Wadell, Anoushka Bhutani, Victor Azumah et al.

Accurate prediction of atomistic, thermodynamic, and kinetic properties from molecular structures underpins materials innovation. Existing computational and experimental approaches lack the scalability required to efficiently navigate chemical space. Scientific foundation models trained on large unlabeled datasets offer a path toward exploring chemical space across diverse application domains. Here we develop MIST, a family of molecular foundation models with up to an order of magnitude more parameters and data than prior works. Trained using a novel tokenization scheme that comprehensively captures nuclear, electronic, and geometric information, MIST learns from a diverse range of molecules. MIST models have been fine-tuned to predict more than 400 structure-property relationships and match or exceed state-of-the-art performance across benchmarks spanning physiology, electrochemistry, and quantum chemistry. We demonstrate the ability of these models to solve real-world problems across chemical space, including multiobjective electrolyte solvent screening, olfactory perception mapping, isotope half-life prediction, stereochemical reasoning for chiral organometallic compounds, and binary and multi-component mixture property prediction. Probing MIST models using mechanistic interpretability methods reveals identifiable patterns and trends not explicitly present in the training data, suggesting that the models learn generalizable scientific concepts. We formulate hyperparameter-penalized Bayesian neural scaling laws and use them to reduce the computational cost of model development by an order of magnitude. The methods and findings presented here represent a significant step toward accelerating materials discovery, design, and optimization using foundation models and provide valuable guidance for training compute-optimal scientific foundation models.

1.2 · SY · Jun 9, 2025

A Digital Twin Framework for Generation-IV Reactors with Reinforcement Learning-Enabled Health-Aware Supervisory Control

Jasmin Y. Lim, Dimitrios Pylorof, Humberto E. Garcia et al.

Generation IV (Gen-IV) nuclear power plants are envisioned to replace the current reactor fleet, bringing improvements in performance, safety, reliability, and sustainability. However, large cost investments currently inhibit the deployment of these advanced reactor concepts. Digital twins bridge real-world systems with digital tools to reduce costs, enhance decision-making, and boost operational efficiency. In this work, a digital twin framework is designed to operate the Gen-IV Fluoride-salt-cooled High-temperature Reactor, utilizing data-enhanced methods to optimize operational and maintenance policies while adhering to system constraints. The closed-loop framework integrates surrogate modeling, reinforcement learning, and Bayesian inference to streamline end-to-end communication for online regulation and self-adjustment. Reinforcement learning is used to consider component health and degradation to drive the target power generations, with constraints enforced through a Reference Governor control algorithm that ensures compliance with pump flow rate and temperature limits. These input driving modules benefit from detailed online simulations that are assimilated to measurement data with Bayesian filtering. The digital twin is demonstrated in three case studies: a one-year long-term operational period showcasing maintenance planning capabilities, short-term accuracy refinement with high-frequency measurements, and system shock capturing that demonstrates real-time recalibration capabilities when boundary conditions change. These demonstrations validate robustness for health-aware and constraint-informed nuclear plant operation, with general applicability to other advanced reactor concepts and complex engineering systems.

2.3 · COMP-PH · Jan 16, 2024

Enhancing Dynamical System Modeling through Interpretable Machine Learning Augmentations: A Case Study in Cathodic Electrophoretic Deposition

Christian Jacobsen, Jiayuan Dong, Mehdi Khalloufi et al.

We introduce a comprehensive data-driven framework aimed at enhancing the modeling of physical systems, employing inference techniques and machine learning enhancements. As a demonstrative application, we pursue the modeling of cathodic electrophoretic deposition (EPD), commonly known as e-coating. Our approach illustrates a systematic procedure for enhancing physical models by identifying their limitations through inference on experimental data and introducing adaptable model enhancements to address these shortcomings. We begin by tackling the issue of model parameter identifiability, which reveals aspects of the model that require improvement. To address generalizability, we introduce modifications which also enhance identifiability. However, these modifications do not fully capture essential experimental behaviors. To overcome this limitation, we incorporate interpretable yet flexible augmentations into the baseline model. These augmentations are parameterized by simple fully-connected neural networks (FNNs), and we leverage machine learning tools, particularly Neural Ordinary Differential Equations (Neural ODEs), to learn these augmentations. Our simulations demonstrate that the machine learning-augmented model more accurately captures observed behaviors and improves predictive accuracy. Nevertheless, we contend that while the model updates offer superior performance and capture the relevant physics, we can reduce off-line computational costs by eliminating certain dynamics without compromising accuracy or interpretability in downstream predictions of quantities of interest, particularly film thickness predictions. The entire process outlined here provides a structured approach to leverage data-driven methods. Firstly, it helps us comprehend the root causes of model inaccuracies, and secondly, it offers a principled method for enhancing model performance.

5.9 · COMP-PH · Sep 15, 2021

Disentangling Generative Factors of Physical Fields Using Variational Autoencoders

Christian Jacobsen, Karthik Duraisamy

The ability to extract generative parameters from high-dimensional fields of data in an unsupervised manner is a highly desirable yet unrealized goal in computational physics. This work explores the use of variational autoencoders (VAEs) for non-linear dimension reduction with the specific aim of disentangling the low-dimensional latent variables to identify independent physical parameters that generated the data. A disentangled decomposition is interpretable, and can be transferred to a variety of tasks including generative modeling, design optimization, and probabilistic reduced order modelling. A major emphasis of this work is to characterize disentanglement using VAEs while minimally modifying the classic VAE loss function (i.e. the Evidence Lower Bound) to maintain high reconstruction accuracy. The loss landscape is characterized by over-regularized local minima which surround desirable solutions. We illustrate comparisons between disentangled and entangled representations by juxtaposing learned latent distributions and the true generative factors in a model porous flow problem. Hierarchical priors are shown to facilitate the learning of disentangled representations. The regularization loss is unaffected by latent rotation when training with rotationally-invariant priors, and thus learning non-rotationally-invariant priors aids in capturing the properties of generative factors, improving disentanglement. Finally, it is shown that semi-supervised learning, accomplished by labeling a small number of samples (O(1%)), results in accurate disentangled latent representations that can be consistently learned.

5.1 · COMP-PH · Sep 14, 2021 · Code

Discretization-independent surrogate modeling over complex geometries using hypernetworks and implicit representations

James Duvall, Karthik Duraisamy, Shaowu Pan

Numerical solutions of partial differential equations (PDEs) require expensive simulations, limiting their application in design optimization, model-based control, and large-scale inverse problems. Surrogate modeling techniques seek to decrease the computational expense while retaining dominant solution features and behavior. Traditional Convolutional Neural Network-based frameworks for surrogate modeling require lossy pixelization and data-preprocessing, and generally are not effective in realistic engineering applications. We propose alternative deep-learning-based surrogate models for discretization-independent, continuous representations of PDE solutions, which can be used for learning and prediction over domains with complex, variable geometry and mesh topology. Three methods are proposed and compared: design-variable-coded multi-layer perceptron (DV-MLP), design-variable hypernetworks (DV-Hnet), and non-linear independent dual system (NIDS). Each method utilizes a main network which consumes pointwise spatial information to provide a continuous representation, allowing predictions at any location in the domain. Input features include a minimum-distance function evaluation to implicitly encode the problem geometry. The geometric design variables, which define and distinguish problem instances, are used differently by each method, appearing as additional main-network input features (DV-MLP), or as hypernetwork inputs (DV-Hnet and NIDS). The methods are applied to predict solutions around complex, parametrically-defined geometries on non-parametrically-defined meshes, with model predictions obtained many orders of magnitude faster than the full order models. Test cases include a vehicle-aerodynamics problem with complex geometry and limited training data, in which a design-variable hypernetwork performs best and achieves a competitive time-to-best-model despite a much greater parameter count.

20.6 COMP-PH Dec 23, 2019

Multi-level Convolutional Autoencoder Networks for Parametric Prediction of Spatio-temporal Dynamics

Jiayang Xu, Karthik Duraisamy

A data-driven framework is proposed for predictive modeling of complex spatio-temporal dynamics, leveraging nested non-linear manifolds. Three levels of neural networks are used, with the goal of predicting the future state of a system of interest in a parametric setting. A convolutional autoencoder is used as the top level to encode the high-dimensional input data along spatial dimensions into a sequence of latent variables. A temporal convolutional autoencoder (TCAE) serves as the second level, which further encodes the output sequence from the first level along the temporal dimension, and outputs a set of latent variables that encapsulate the spatio-temporal evolution of the dynamics. The use of dilated temporal convolutions grows the receptive field exponentially with network depth, allowing for efficient processing of long temporal sequences typical of scientific computations. A fully-connected network is used as the third level to learn the mapping between these latent variables and the global parameters from training data, and predict them for new parameters. For future state predictions, the second level uses a temporal convolutional network to predict subsequent steps of the output sequence from the top level. Latent variables at the bottom-most level are decoded to obtain the dynamics in physical space at new global parameters and/or at a future time. Predictive capabilities are evaluated on a range of problems involving discontinuities, wave propagation, strong transients, and coherent structures. The sensitivity of the results to different modeling choices is assessed. The results suggest that given adequate data and careful training, effective data-driven predictive models can be constructed. Perspectives are provided on the present approach and its place in the landscape of model reduction.
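
The exponential growth of the receptive field with depth can be checked with a short calculation. The sketch below assumes a kernel size of 2 and a dilation that doubles at each layer; the abstract does not state the actual schedule, so these values and the function names are illustrative only.

```python
def receptive_field(num_layers, kernel_size=2, dilation_base=2):
    """Receptive field of stacked 1-D convolutions with dilation base**layer."""
    rf = 1
    for layer in range(num_layers):
        dilation = dilation_base ** layer
        rf += (kernel_size - 1) * dilation
    return rf

def receptive_field_undilated(num_layers, kernel_size=2):
    """Receptive field of the same stack without dilation (linear growth)."""
    return 1 + num_layers * (kernel_size - 1)

for depth in (2, 4, 10):
    print(depth, receptive_field(depth), receptive_field_undilated(depth))
# 10 layers cover 1024 time steps with dilation, versus 11 without
```

Under these assumptions the receptive field is 2^L for L layers, which is why a modest number of layers can span the long sequences mentioned above.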

19.3 DS Jun 9, 2019 Code

Physics-Informed Probabilistic Learning of Linear Embeddings of Non-linear Dynamics With Guaranteed Stability

Shaowu Pan, Karthik Duraisamy

The Koopman operator has emerged as a powerful tool for the analysis of nonlinear dynamical systems as it provides coordinate transformations to globally linearize the dynamics. While recent deep learning approaches have been useful in extracting the Koopman operator from a data-driven perspective, several challenges remain. In this work, we formalize the problem of learning the continuous-time Koopman operator with deep neural networks in a measure-theoretic framework. Our approach induces two types of models, differential and recurrent forms, the choice of which depends on the availability of the governing equations and data. We then enforce a structural parameterization that renders the realization of the Koopman operator provably stable. A new autoencoder architecture is constructed, such that only the residual of the dynamic mode decomposition is learned. Finally, we employ mean-field variational inference (MFVI) on the aforementioned framework in a hierarchical Bayesian setting to quantify uncertainties in the characterization and prediction of the dynamics of observables. The framework is evaluated on a simple polynomial system, the Duffing oscillator, and an unstable cylinder wake flow with noisy measurements.
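
One well-known way to make a learned linear operator stable by construction is to write it as a skew-symmetric matrix minus a non-negative diagonal matrix, K = S - D. The sketch below illustrates that idea only; the abstract does not spell out the parameterization used in the paper, so the construction, the squared-diagonal trick, and the function names are assumptions. For dz/dt = K z, the quantity z^T K z equals half the rate of change of ||z||^2, and with this structure it can never be positive for any unconstrained parameter values.

```python
def stable_matrix(off_diag, diag_raw):
    """Build K = S - D, with S skew-symmetric and D = diag(diag_raw**2) >= 0.

    off_diag lists the strictly upper-triangular entries row by row.
    """
    n = len(diag_raw)
    K = [[0.0] * n for _ in range(n)]
    idx = 0
    for i in range(n):
        K[i][i] = -diag_raw[i] ** 2      # non-positive diagonal
        for j in range(i + 1, n):
            K[i][j] = off_diag[idx]      # upper triangle
            K[j][i] = -off_diag[idx]     # mirrored with opposite sign
            idx += 1
    return K

def energy_rate(K, z):
    """Return z^T K z, half of d/dt ||z||^2 under dz/dt = K z."""
    Kz = [sum(kij * zj for kij, zj in zip(row, z)) for row in K]
    return sum(zi * kzi for zi, kzi in zip(z, Kz))

K = stable_matrix([1.0, -2.0, 0.5], [0.3, 0.0, 1.2])
print(energy_rate(K, [1.0, -2.0, 0.5]))  # non-positive for any z
```

The skew-symmetric part contributes nothing to z^T K z, so only the diagonal term remains and the norm of the latent state cannot grow in time.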

16.3 ML May 31, 2018

Long-time predictive modeling of nonlinear dynamical systems using neural networks

Shaowu Pan, Karthik Duraisamy

We study the use of feedforward neural networks (FNN) to develop models of nonlinear dynamical systems from data. Emphasis is placed on predictions at long times, with limited data availability. Inspired by global stability analysis, and the observation of the strong correlation between the local error and the maximum singular value of the Jacobian of the FNN, we introduce Jacobian regularization in the loss function. This regularization suppresses the sensitivity of the prediction to the local error and is shown to improve accuracy and robustness. A comparison between the proposed approach and sparse polynomial regression is presented in numerical examples ranging from simple ODE systems to nonlinear PDE systems including vortex shedding behind a cylinder, and instability-driven buoyant mixing flow. Furthermore, limitations of feedforward neural networks are highlighted, especially when the training data does not include a low-dimensional attractor. Strategies of data augmentation are presented as remedies to address these issues to a certain extent.
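
A Jacobian penalty of this kind can be sketched for a network with one tanh hidden layer, where the input-output Jacobian has the closed form W2 diag(1 - tanh(z)^2) W1. The example below is a sketch under stated assumptions: the squared Frobenius norm as the penalty, the weight `lam`, the network shape, and all function names are illustrative choices, not details taken from the paper.

```python
import math

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def forward(params, x):
    """f(x) = W2 tanh(W1 x + b1) + b2."""
    W1, b1, W2, b2 = params
    h = [math.tanh(z + b) for z, b in zip(matvec(W1, x), b1)]
    return [y + b for y, b in zip(matvec(W2, h), b2)]

def jacobian(params, x):
    """Analytic Jacobian df/dx = W2 diag(1 - tanh(z)^2) W1."""
    W1, b1, W2, _ = params
    z = [zi + b for zi, b in zip(matvec(W1, x), b1)]
    d = [1.0 - math.tanh(zi) ** 2 for zi in z]
    n_out, n_in = len(W2), len(W1[0])
    return [[sum(W2[i][k] * d[k] * W1[k][j] for k in range(len(d)))
             for j in range(n_in)] for i in range(n_out)]

def regularized_loss(params, xs, ys, lam):
    """Mean squared error plus lam times the mean squared Frobenius norm of J."""
    mse, penalty = 0.0, 0.0
    for x, y in zip(xs, ys):
        pred = forward(params, x)
        mse += sum((p - t) ** 2 for p, t in zip(pred, y))
        J = jacobian(params, x)
        penalty += sum(v * v for row in J for v in row)
    n = len(xs)
    return mse / n + lam * penalty / n

params = ([[0.5, -0.2], [0.1, 0.3]], [0.0, 0.1],
          [[1.0, 0.5], [-0.3, 0.8]], [0.0, 0.0])
xs, ys = [[0.4, -0.7]], [[0.1, 0.2]]
print(regularized_loss(params, xs, ys, lam=0.0))
print(regularized_loss(params, xs, ys, lam=0.1))
```

The Frobenius norm bounds the maximum singular value from above, so penalizing it is one simple way to limit how strongly a local error is amplified from one prediction step to the next.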

2.1 LG Nov 6, 2015

Efficient Multiscale Gaussian Process Regression using Hierarchical Clustering

Z. Zhang, K. Duraisamy, N. A. Gumerov

Standard Gaussian Process (GP) regression, a powerful machine learning tool, is computationally expensive when it is applied to large datasets, and potentially inaccurate when data points are sparsely distributed in a high-dimensional feature space. To address these challenges, a new multiscale, sparsified GP algorithm is formulated, with the goal of application to large scientific computing datasets. In this approach, the data is partitioned into clusters and the cluster centers are used to define a reduced training set, resulting in an improvement over standard GPs in terms of training and evaluation costs. Further, a hierarchical technique is used to adaptively map the local covariance representation to the underlying sparsity of the feature space, leading to improved prediction accuracy when the data distribution is highly non-uniform. A theoretical investigation of the computational complexity of the algorithm is presented. The efficacy of this method is then demonstrated on smooth and discontinuous analytical functions and on data from a direct numerical simulation of turbulent combustion.
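
The cost saving from training on cluster centers rather than on every data point can be sketched in one dimension. This example covers only the single-scale reduction step, not the hierarchical multiscale part of the algorithm. The k-means clustering, the use of each cluster's mean target as its training value, the RBF kernel, the hyperparameters, and all function names are assumptions made for illustration.

```python
import math

def cluster_reduce(xs, ys, k, iters=20):
    """Lloyd's k-means in 1-D; returns (center, mean target) per non-empty cluster."""
    order = sorted(xs)
    centers = [order[(2 * i + 1) * len(xs) // (2 * k)] for i in range(k)]
    groups = []
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x, y in zip(xs, ys):
            j = min(range(k), key=lambda c: abs(x - centers[c]))
            groups[j].append((x, y))
        centers = [sum(p[0] for p in g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
    return [(sum(p[0] for p in g) / len(g), sum(p[1] for p in g) / len(g))
            for g in groups if g]

def rbf(a, b, length):
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_gp(xs, ys, length=0.5, noise=1e-4):
    """Zero-mean GP regression; returns the posterior-mean predictor."""
    K = [[rbf(a, b, length) + (noise if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    alpha = solve(K, ys)
    return lambda x: sum(a * rbf(x, xi, length) for a, xi in zip(alpha, xs))

# 200 samples of sin(x) reduced to at most 15 training points.
xs = [2.0 * math.pi * i / 200 for i in range(200)]
ys = [math.sin(x) for x in xs]
reduced = cluster_reduce(xs, ys, k=15)
predict = fit_gp([c for c, _ in reduced], [t for _, t in reduced])
print(len(reduced), predict(1.0), math.sin(1.0))
```

Solving the dense linear system scales cubically with the number of training points, so replacing N points by M cluster centers reduces that step from O(N^3) to O(M^3), at the price of smoothing the data within each cluster.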