LG Feb 13, 2023
Multiscale Graph Neural Network Autoencoders for Interpretable Scientific Machine Learning
Shivam Barwey, Varun Shankar, Venkatasubramanian Viswanathan et al.
The goal of this work is to address two limitations in autoencoder-based models: latent space interpretability and compatibility with unstructured meshes. This is accomplished here with the development of a novel graph neural network (GNN) autoencoding architecture with demonstrations on complex fluid flow applications. To address the first goal of interpretability, the GNN autoencoder achieves reduction in the number of nodes in the encoding stage through an adaptive graph reduction procedure. This reduction procedure essentially amounts to flowfield-conditioned node sampling and sensor identification, and produces interpretable latent graph representations tailored to the flowfield reconstruction task in the form of so-called masked fields. These masked fields allow the user to (a) visualize where in physical space a given latent graph is active, and (b) interpret the time-evolution of the latent graph connectivity in accordance with the time-evolution of unsteady flow features (e.g. recirculation zones, shear layers) in the domain. To address the goal of unstructured mesh compatibility, the autoencoding architecture utilizes a series of multi-scale message passing (MMP) layers, each of which models information exchange among node neighborhoods at various lengthscales. The MMP layer, which augments standard single-scale message passing with learnable coarsening operations, allows the decoder to more efficiently reconstruct the flowfield from the identified regions in the masked fields. Analysis of latent graphs produced by the autoencoder for various model settings is conducted using unstructured snapshot data sourced from large-eddy simulations in a backward-facing step (BFS) flow configuration with an OpenFOAM-based flow solver at high Reynolds numbers.
DS Jun 8, 2018
Robust Node Generation for Meshfree Discretizations on Irregular Domains and Surfaces
Varun Shankar, Robert M. Kirby, Aaron L. Fogelson
We present a new algorithm for the automatic one-shot generation of scattered node sets on irregular 2D and 3D domains using Poisson disk sampling coupled to novel parameter-free, high-order parametric Spherical Radial Basis Function (SBF)-based geometric modeling of irregular domain boundaries. Our algorithm also automatically modifies the scattered node sets locally for time-varying embedded boundaries in the domain interior. We derive complexity estimates for our node generator in 2D and 3D that establish its scalability, and verify these estimates with timing experiments. We explore the influence of Poisson disk sampling parameters on both quasi-uniformity in the node sets and errors in an RBF-FD discretization of the heat equation. In all cases, our framework requires only a small number of "seed" nodes on domain boundaries. The entire framework exhibits O(N) complexity in both 2D and 3D.
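To make the Poisson disk sampling ingredient concrete, here is a minimal sketch of a generic Bridson-style sampler on the unit square. It is not the algorithm from the abstract: there is no boundary modeling, no seed nodes, and no embedded boundaries, and the names `poisson_disk_2d`, `radius`, and `k` are illustrative assumptions. It only shows how a background grid keeps each candidate check local, which is the usual route to linear-time behavior.

```python
import math
import random


def poisson_disk_2d(radius, k=30, seed=0):
    """Bridson-style Poisson disk sampling on the unit square [0, 1)^2.

    Illustrative sketch only; returns a list of (x, y) nodes whose
    pairwise distances are all at least `radius`.
    """
    rng = random.Random(seed)
    cell = radius / math.sqrt(2.0)         # cell diagonal equals radius
    n = int(math.ceil(1.0 / cell))
    grid = [[None] * n for _ in range(n)]  # at most one node per cell

    def cell_of(p):
        return min(int(p[0] / cell), n - 1), min(int(p[1] / cell), n - 1)

    def fits(p):
        # Only cells within two steps can hold a node closer than radius.
        ci, cj = cell_of(p)
        for i in range(max(ci - 2, 0), min(ci + 3, n)):
            for j in range(max(cj - 2, 0), min(cj + 3, n)):
                q = grid[i][j]
                if q is not None and math.dist(p, q) < radius:
                    return False
        return True

    first = (rng.random(), rng.random())
    nodes, active = [first], [first]
    ci, cj = cell_of(first)
    grid[ci][cj] = first

    while active:
        idx = rng.randrange(len(active))
        base = active[idx]
        for _ in range(k):
            # Candidate in the annulus [radius, 2 * radius] around base.
            rho = rng.uniform(radius, 2.0 * radius)
            theta = rng.uniform(0.0, 2.0 * math.pi)
            p = (base[0] + rho * math.cos(theta),
                 base[1] + rho * math.sin(theta))
            if 0.0 <= p[0] < 1.0 and 0.0 <= p[1] < 1.0 and fits(p):
                nodes.append(p)
                active.append(p)
                ci, cj = cell_of(p)
                grid[ci][cj] = p
                break
        else:
            # No candidate fit after k tries: retire this node.
            active.pop(idx)
    return nodes


nodes = poisson_disk_2d(0.1)
```

The parameters `radius` and `k` play the role of the "Poisson disk sampling parameters" whose influence on quasi-uniformity the abstract mentions; a smaller `radius` gives denser node sets, and a larger `k` tends to fill gaps more thoroughly at higher cost.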
FLU-DYN Sep 23, 2022
Differentiable physics-enabled closure modeling for Burgers' turbulence
Varun Shankar, Vedant Puri, Ramesh Balakrishnan et al.
Data-driven turbulence modeling is experiencing a surge in interest following algorithmic and hardware developments in the data sciences. We discuss an approach using the differentiable physics paradigm that combines known physics with machine learning to develop closure models for Burgers' turbulence. We consider the 1D Burgers system as a prototypical test problem for modeling the unresolved terms in advection-dominated turbulence problems. We train a series of models that incorporate varying degrees of physical assumptions on an a posteriori loss function to test the efficacy of models across a range of system parameters, including viscosity, time, and grid resolution. We find that constraining models with inductive biases in the form of partial differential equations that contain known physics or existing closure approaches produces highly data-efficient, accurate, and generalizable models, outperforming state-of-the-art baselines. Addition of structure in the form of physics information also brings a level of interpretability to the models, potentially offering a stepping stone to the future of closure modeling.
FLU-DYN Jul 25, 2023
Generalizable data-driven turbulence closure modeling on unstructured grids with differentiable physics
Hojin Kim, Varun Shankar, Venkatasubramanian Viswanathan et al.
Differentiable physical simulators are proving to be valuable tools for developing data-driven models for computational fluid dynamics (CFD). In particular, these simulators enable end-to-end training of machine learning (ML) models embedded within CFD solvers. This paradigm enables novel algorithms which combine the generalization power and low cost of physics-based simulations with the flexibility and automation of deep learning methods. In this study, we introduce a framework for embedding deep learning models within a finite element solver for incompressible Navier-Stokes equations, specifically applying this approach to learn a subgrid-scale (SGS) closure with a graph neural network (GNN). We first demonstrate the feasibility of the approach on flow over a two-dimensional backward-facing step, using it as a proof of concept to show that solver-consistent training produces stable and physically meaningful closures. Then, we extend this to a turbulent flow over a three-dimensional backward-facing step. In this setting, the GNN-based closure not only attains low prediction errors, but also recovers key turbulence statistics and preserves multiscale turbulent structures. We further demonstrate that the closure can be identified in data-limited learning scenarios as well. Overall, the proposed end-to-end learning paradigm offers a viable pathway toward physically consistent and generalizable data-driven SGS modeling on complex and unstructured domains.
NA Apr 10, 2018
Curvilinear Mesh Adaptation using Radial Basis Function Interpolation and Smoothing
Vidhi Zala, Varun Shankar, Shankar P. Sastry et al.
We present a new iterative technique based on radial basis function (RBF) interpolation and smoothing for the generation and smoothing of curvilinear meshes from straight-sided or other curvilinear meshes. Our technique approximates the coordinate deformation maps in both the interior and boundary of the curvilinear output mesh by using only scattered nodes on the boundary of the input mesh as data sites in an interpolation problem. Our technique produces high-quality meshes in the deformed domain even when the deformation maps are singular due to a new iterative algorithm based on modification of the RBF shape parameter. Due to the use of RBF interpolation, our technique is applicable to both 2D and 3D curvilinear mesh generation without significant modification.
LG May 19, 2022
Accelerated Training of Physics-Informed Neural Networks (PINNs) using Meshless Discretizations
Ramansh Sharma, Varun Shankar
We present a new technique for the accelerated training of physics-informed neural networks (PINNs): discretely-trained PINNs (DT-PINNs). The repeated computation of partial derivative terms in the PINN loss functions via automatic differentiation during training is known to be computationally expensive, especially for higher-order derivatives. DT-PINNs are trained by replacing these exact spatial derivatives with high-order accurate numerical discretizations computed using meshless radial basis function-finite differences (RBF-FD) and applied via sparse-matrix vector multiplication. The use of RBF-FD allows for DT-PINNs to be trained even on point cloud samples placed on irregular domain geometries. Additionally, though traditional PINNs (vanilla-PINNs) are typically stored and trained in 32-bit floating-point (fp32) on the GPU, we show that for DT-PINNs, using fp64 on the GPU leads to significantly faster training times than fp32 vanilla-PINNs with comparable accuracy. We demonstrate the efficiency and accuracy of DT-PINNs via a series of experiments. First, we explore the effect of network depth on both numerical and automatic differentiation of a neural network with random weights and show that RBF-FD approximations of third-order accuracy and above are more efficient while being sufficiently accurate. We then compare the DT-PINNs to vanilla-PINNs on both linear and nonlinear Poisson equations and show that DT-PINNs achieve similar losses with 2-4x faster training times on a consumer GPU. Finally, we also demonstrate that similar results can be obtained for the PINN solution to the heat equation (a space-time problem) by discretizing the spatial derivatives using RBF-FD and using automatic differentiation for the temporal derivative. Our results show that fp64 DT-PINNs offer a superior cost-accuracy profile to fp32 vanilla-PINNs.
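The core idea of replacing automatic differentiation with a precomputed sparse differentiation matrix can be sketched as follows. This is an illustration under stated assumptions, not the DT-PINN implementation: a three-point finite-difference stencil on a uniform 1D grid stands in for RBF-FD weights, `sin(x)` stands in for network outputs at the collocation points, and the helpers `second_derivative_rows` and `spmv` are hypothetical names.

```python
import math


def second_derivative_rows(n, h):
    """Sparse rows {column: weight} approximating d^2/dx^2 at interior nodes.

    Assumption: a second-order 3-point stencil replaces the RBF-FD
    weights that the abstract describes.
    """
    rows = []
    for i in range(1, n - 1):
        rows.append({i - 1: 1.0 / h**2, i: -2.0 / h**2, i + 1: 1.0 / h**2})
    return rows


def spmv(rows, u):
    """Sparse matrix-vector product: one dot product per stored row."""
    return [sum(w * u[j] for j, w in row.items()) for row in rows]


n = 101
h = math.pi / (n - 1)
x = [i * h for i in range(n)]
u = [math.sin(xi) for xi in x]      # stand-in for network outputs

L = second_derivative_rows(n, h)    # built once, before training
uxx = spmv(L, u)                    # reused at every training step

# Compare against the exact second derivative, -sin(x), at interior nodes.
max_err = max(abs(uxx[i - 1] + math.sin(x[i])) for i in range(1, n - 1))
```

Because the matrix depends only on the node locations, it is assembled once and each loss evaluation reduces to a sparse matrix-vector product on the network outputs, which is the source of the savings the abstract describes.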
FLU-DYN Jul 7, 2023
Differentiable Turbulence: Closure as a partial differential equation constrained optimization
Varun Shankar, Dibyajyoti Chakraborty, Venkatasubramanian Viswanathan et al.
Deep learning is increasingly becoming a promising pathway to improving the accuracy of sub-grid scale (SGS) turbulence closure models for large eddy simulations (LES). We leverage the concept of differentiable turbulence, whereby an end-to-end differentiable solver is used in combination with physics-inspired choices of deep learning architectures to learn highly effective and versatile SGS models for two-dimensional turbulent flow. We perform an in-depth analysis of the inductive biases in the chosen architectures, finding that the inclusion of small-scale non-local features is most critical to effective SGS modeling, while large-scale features can improve pointwise accuracy of the a posteriori solution field. The velocity gradient tensor on the LES grid can be mapped directly to the SGS stress via decomposition of the inputs and outputs into isotropic, deviatoric, and anti-symmetric components. We see that the model can generalize to a variety of flow configurations, including higher and lower Reynolds numbers and different forcing conditions. We show that the differentiable physics paradigm is more successful than offline, a priori learning, and that hybrid solver-in-the-loop approaches to deep learning offer an ideal balance between computational efficiency, accuracy, and generalization. Our experiments provide physics-based recommendations for deep-learning based SGS modeling for generalizable closure modeling of turbulence.
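The isotropic, deviatoric, and anti-symmetric split mentioned above is a standard algebraic identity for any square tensor. The sketch below shows only that identity on a small made-up gradient tensor; how the learned model consumes or produces these components is not specified in the abstract, and the function name `decompose` and the sample values are assumptions.

```python
def decompose(grad):
    """Split a d x d tensor into isotropic, deviatoric, anti-symmetric parts.

    grad = iso + dev + anti, where iso = (tr(grad) / d) * I,
    dev is the trace-free symmetric part, and anti is the
    anti-symmetric (rotation-rate) part.
    """
    d = len(grad)
    trace = sum(grad[i][i] for i in range(d))
    iso = [[trace / d if i == j else 0.0 for j in range(d)] for i in range(d)]
    sym = [[0.5 * (grad[i][j] + grad[j][i]) for j in range(d)] for i in range(d)]
    dev = [[sym[i][j] - iso[i][j] for j in range(d)] for i in range(d)]
    anti = [[0.5 * (grad[i][j] - grad[j][i]) for j in range(d)] for i in range(d)]
    return iso, dev, anti


# Hypothetical 2D velocity gradient tensor, entries du_i/dx_j.
grad_u = [[0.3, 1.2],
          [-0.4, 0.5]]
iso, dev, anti = decompose(grad_u)

# Summing the three parts recovers the original tensor.
recon = [[iso[i][j] + dev[i][j] + anti[i][j] for j in range(2)] for i in range(2)]
```

For an exactly incompressible velocity field the trace, and hence the isotropic part, of the velocity gradient vanishes; the sample tensor above deliberately has a nonzero trace so that all three parts are visible.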
LG Aug 14, 2023
Locally Adaptive and Differentiable Regression
Mingxuan Han, Varun Shankar, Jeff M Phillips et al.
Over-parameterized models like deep nets and random forests have become very popular in machine learning. However, the natural goals of continuity and differentiability, common in regression models, are now often ignored in modern over-parameterized, locally-adaptive models. We propose a general framework to construct a globally continuous and differentiable model based on a weighted average of locally learned models in corresponding local regions. This model is competitive in dealing with data with different densities or scales of function values in different local regions. We demonstrate that when we mix kernel ridge and polynomial regression terms in the local models, and stitch them together continuously, we achieve faster statistical convergence in theory and improved performance in various practical settings.
FLU-DYN Feb 17
Fluids You Can Trust: Property-Preserving Operator Learning for Incompressible Flows
Ramansh Sharma, Matthew Lowery, Houman Owhadi et al.
We present a novel property-preserving kernel-based operator learning method for incompressible flows governed by the incompressible Navier-Stokes equations. Traditional numerical solvers incur significant computational costs to respect incompressibility. Operator learning offers efficient surrogate models, but current neural operators fail to exactly enforce physical properties such as incompressibility, periodicity, and turbulence. Our method maps input functions to expansion coefficients of output functions in a property-preserving kernel basis, ensuring that predicted velocity fields analytically and simultaneously preserve the aforementioned physical properties. We evaluate the method on challenging 2D and 3D, laminar and turbulent, incompressible flow problems. Our method achieves up to six orders of magnitude lower relative $\ell_2$ errors upon generalization and trains up to five orders of magnitude faster compared to neural operators. Moreover, while our method enforces incompressibility analytically, neural operators exhibit very large deviations. Our results show that our method provides an accurate and efficient surrogate for incompressible flows.
RO May 18
Neural Operators for Design-Space Surrogate Modeling of Tendon-Actuated Continuum Robots
Branden Frieden, James M. Ferguson, Alan Kuntz et al.
Continuum robots enable dexterous manipulation in constrained environments, but require accurate and efficient models for real-time manipulation and control. Traditional physics-based models can be computationally expensive and may suffer from inaccuracies due to unmodeled effects, while current learning-based methods often generalize poorly beyond the specific robot on which they are trained. We present a formulation of surrogate modeling for tendon-driven continuum robots as an operator learning problem that maps robot design parameters and tendon actuation inputs to resulting configurations. This formulation enables a single trained model to generalize across a large class of robot designs. We develop four novel neural operator architectures--two based on Deep Operator Networks (DeepONets) and two based on Fourier Neural Operators (FNOs)--and train them on simulation data to predict robot configurations. All architectures achieve good accuracy while allowing for fast and accurate generalization across designs. Our results demonstrate that operator learning provides an effective and generalizable surrogate for continuum robot mechanics in the design space, enabling fast modeling for control, planning, and design optimization in surgical and industrial applications.
MS Apr 23
JetSCI: A Hybrid JAX-PETSc Framework for Scalable Differentiable Simulation
Alberto Cattaneo, M Keith Ballard, Robert M. Kirby et al.
The rapid rise of scientific machine learning (SciML) has expanded the role of differentiable modeling, surrogate modeling, and data-driven constitutive laws in large-scale simulation. The JAX framework provides an attractive environment for these workflows through automatically differentiable programs, vectorization, and GPU acceleration, while enabling seamless learning of surrogate models. However, large-scale simulation still relies on mature HPC infrastructure. Libraries such as PETSc provide scalable MPI-based parallelism, robust linear and nonlinear solvers, and advanced preconditioning capabilities that remain difficult to reproduce in JAX-only workflows. We present JetSCI, a hybrid JAX-PETSc framework that unifies these complementary strengths. JetSCI uses JAX for GPU-parallel differentiable discretizations and PETSc for robust, scalable solution of the resulting systems on distributed-memory architectures, exposing multilevel parallelism through GPU acceleration within nodes and MPI parallelism across nodes. For finite element discretizations of heterogeneous micromechanics problems, JetSCI outperforms JAX-only implementations in efficiency and accuracy.
LG May 20, 2024
Ensemble and Mixture-of-Experts DeepONets For Operator Learning
Ramansh Sharma, Varun Shankar
We present a novel deep operator network (DeepONet) architecture for operator learning, the ensemble DeepONet, that allows for enriching the trunk network of a single DeepONet with multiple distinct trunk networks. This trunk enrichment allows for greater expressivity and generalization capabilities over a range of operator learning problems. We also present a spatial mixture-of-experts (MoE) DeepONet trunk network architecture that utilizes a partition-of-unity (PoU) approximation to promote spatial locality and model sparsity in the operator learning problem. We first prove that both the ensemble and PoU-MoE DeepONets are universal approximators. We then demonstrate that ensemble DeepONets containing a trunk ensemble of a standard trunk, the PoU-MoE trunk, and/or a proper orthogonal decomposition (POD) trunk can achieve 2-4x lower relative $\ell_2$ errors than standard DeepONets and POD-DeepONets on both standard and challenging new operator learning problems involving partial differential equations (PDEs) in two and three dimensions. Our new PoU-MoE formulation provides a natural way to incorporate spatial locality and model sparsity into any neural network architecture, while our new ensemble DeepONet provides a powerful and general framework for incorporating basis enrichment in scientific machine learning architectures for operator learning.
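A partition-of-unity blend of local experts can be sketched in a few lines. This is a 1D illustration under assumptions, not the PoU-MoE trunk itself: simple closed-form functions stand in for the expert trunk networks, a compactly supported Wendland C2 function is one possible weight choice, and the names `bump`, `pou_weights`, `pou_blend`, `centers`, and `radius` are hypothetical.

```python
def bump(r):
    """Wendland C2 weight: smooth, and exactly zero for r >= 1."""
    return (1.0 - r) ** 4 * (4.0 * r + 1.0) if r < 1.0 else 0.0


def pou_weights(x, centers, radius):
    """Normalized patch weights at x; they sum to 1 wherever patches cover x."""
    raw = [bump(abs(x - c) / radius) for c in centers]
    total = sum(raw)  # assumes x lies inside at least one patch
    return [w / total for w in raw]


def pou_blend(x, centers, radius, experts):
    """Weighted sum of expert outputs; experts with zero weight are skipped."""
    weights = pou_weights(x, centers, radius)
    return sum(w * f(x) for w, f in zip(weights, experts) if w > 0.0)


# Three overlapping patches covering [0, 1].
centers = [0.0, 0.5, 1.0]
radius = 0.6

# Hypothetical local experts (stand-ins for trunk networks).
experts = [lambda x: 1.0 + x, lambda x: x * x, lambda x: 2.0 - x]

w = pou_weights(0.25, centers, radius)
y = pou_blend(0.25, centers, radius, experts)
```

At `x = 0.25` the third patch is out of range, so its weight is exactly zero and its expert is never evaluated. That compact support is what gives the blend both spatial locality and sparsity.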
LG Oct 24, 2025
Deep Gaussian Processes for Functional Maps
Matthew Lowery, Zhitong Xu, Da Long et al.
Learning mappings between functional spaces, also known as function-on-function regression, plays a crucial role in functional data analysis and has broad applications, e.g. spatiotemporal forecasting, curve prediction, and climate modeling. Existing approaches, such as functional linear models and neural operators, either fall short of capturing complex nonlinearities or lack reliable uncertainty quantification under noisy, sparse, and irregularly sampled data. To address these issues, we propose Deep Gaussian Processes for Functional Maps (DGPFM). Our method designs a sequence of GP-based linear and nonlinear transformations, leveraging integral transforms of kernels, GP interpolation, and nonlinear activations sampled from GPs. A key insight simplifies implementation: under fixed locations, discrete approximations of kernel integral transforms collapse into direct functional integral transforms, enabling flexible incorporation of various integral transform designs. To achieve scalable probabilistic inference, we use inducing points and whitening transformations to develop a variational learning algorithm. Empirical results on real-world and PDE benchmark datasets demonstrate the advantage of DGPFM in both predictive performance and uncertainty calibration.
LG Jun 30, 2024
Kernel Neural Operators (KNOs) for Scalable, Memory-efficient, Geometrically-flexible Operator Learning
Matthew Lowery, John Turnage, Zachary Morrow et al.
This paper introduces the Kernel Neural Operator (KNO), a provably convergent operator-learning architecture that utilizes compositions of deep kernel-based integral operators for function-space approximation of operators (maps from functions to functions). The KNO decouples the choice of kernel from the numerical integration scheme (quadrature), thereby naturally allowing for operator learning with explicitly-chosen trainable kernels on irregular geometries. On irregular domains, this allows the KNO to utilize domain-specific quadrature rules. To help ameliorate the curse of dimensionality, we also leverage an efficient dimension-wise factorization algorithm on regular domains. More importantly, the ability to explicitly specify kernels also allows the use of highly expressive, non-stationary, neural anisotropic kernels whose parameters are computed by training neural networks. Numerical results demonstrate that on existing benchmarks the training and test accuracy of KNOs is comparable to or higher than popular operator learning techniques while typically using an order of magnitude fewer trainable parameters, with the more expressive kernels proving important to attaining high accuracy. KNOs thus facilitate low-memory, geometrically-flexible, deep operator learning, while retaining the implementation simplicity and transparency of traditional kernel methods from both scientific computing and machine learning.
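The decoupling of kernel and quadrature can be illustrated with a single discretized kernel integral operator, (Ku)(x) ≈ Σ_j w_j k(x, y_j) u(y_j). The sketch below is a 1D illustration under assumptions, not the KNO architecture: it has no trainable parameters or layer composition, a trapezoid rule and a fixed Gaussian kernel are arbitrary stand-ins, and all function names are hypothetical.

```python
import math


def trapezoid_rule(a, b, n):
    """Nodes and weights of the composite trapezoid rule on [a, b]."""
    h = (b - a) / (n - 1)
    nodes = [a + i * h for i in range(n)]
    weights = [h] * n
    weights[0] = weights[-1] = 0.5 * h
    return nodes, weights


def kernel_integral(kernel, u, nodes, weights, x_eval):
    """Approximate (Ku)(x) = integral of k(x, y) u(y) dy at each x in x_eval.

    The kernel and the quadrature rule are separate arguments, so either
    can be swapped without touching the other.
    """
    u_vals = [u(y) for y in nodes]
    return [
        sum(w * kernel(x, y) * uy for w, y, uy in zip(weights, nodes, u_vals))
        for x in x_eval
    ]


def gaussian_kernel(x, y, length_scale=0.2):
    """Stationary Gaussian kernel; a trainable kernel would replace this."""
    return math.exp(-((x - y) / length_scale) ** 2)


nodes, weights = trapezoid_rule(0.0, 2.0, 201)
x_eval = [0.5, 1.0, 1.5]

smoothed = kernel_integral(gaussian_kernel, math.sin, nodes, weights, x_eval)

# Sanity check with a constant kernel: this reduces to plain integration.
area = kernel_integral(lambda x, y: 1.0, lambda y: y, nodes, weights, x_eval)
```

On an irregular domain only `trapezoid_rule` would need to change, to a rule suited to that geometry, while `kernel_integral` and the kernel stay as they are. This mirrors the separation the abstract describes.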
LG Jun 4, 2024
Polynomial-Augmented Neural Networks (PANNs) with Weak Orthogonality Constraints for Enhanced Function and PDE Approximation
Madison Cooley, Shandian Zhe, Robert M. Kirby et al.
We present polynomial-augmented neural networks (PANNs), a novel machine learning architecture that combines deep neural networks (DNNs) with a polynomial approximant. PANNs combine the strengths of DNNs (flexibility and efficiency in higher-dimensional approximation) with those of polynomial approximation (rapid convergence rates for smooth functions). To aid in both stable training and enhanced accuracy over a variety of problems, we present (1) a family of orthogonality constraints that impose mutual orthogonality between the polynomial and the DNN within a PANN; (2) a simple basis pruning approach to combat the curse of dimensionality introduced by the polynomial component; and (3) an adaptation of a polynomial preconditioning strategy to both DNNs and polynomials. We test the resulting architecture for its polynomial reproduction properties, ability to approximate both smooth functions and functions of limited smoothness, and as a method for the solution of partial differential equations (PDEs). Through these experiments, we demonstrate that PANNs offer superior approximation properties to DNNs for both regression and the numerical solution of PDEs, while also offering enhanced accuracy over both polynomial and DNN-based regression (each) when regressing functions with limited smoothness.
FLU-DYN May 3, 2023
Importance of equivariant and invariant symmetries for fluid flow modeling
Varun Shankar, Shivam Barwey, Zico Kolter et al.
Graph neural networks (GNNs) have shown promise in learning unstructured mesh-based simulations of physical systems, including fluid dynamics. In tandem, geometric deep learning principles have informed the development of equivariant architectures respecting underlying physical symmetries. However, the effect of rotational equivariance in modeling fluids remains unclear. We build a multi-scale equivariant GNN to forecast fluid flow and study the effect of modeling invariant and non-invariant representations of the flow state. We evaluate the model performance of several equivariant and non-equivariant architectures on predicting the evolution of two fluid flows, flow around a cylinder and buoyancy-driven shear flow, to understand the effect of equivariance and invariance on data-driven modeling approaches. Our results show that modeling invariant quantities produces more accurate long-term predictions and that these invariant quantities may be learned from the velocity field using a data-driven encoder.