Shyam Sankaran

h-index53

6papers

507citations

Novelty62%

AI Score51

Ranked #36,647 of 205,806 authors (top 18%)#8,355 in LG (top 20%)

6 Papers

LGMar 14, 2022

Respecting causality is all you need for training physics-informed neural networks

Sifan Wang, Shyam Sankaran, Paris Perdikaris

While the popularity of physics-informed neural networks (PINNs) is steadily rising, to this date PINNs have not been successful in simulating dynamical systems whose solution exhibits multi-scale, chaotic or turbulent behavior. In this work we attribute this shortcoming to the inability of existing PINNs formulations to respect the spatio-temporal causal structure that is inherent to the evolution of physical systems. We argue that this is a fundamental limitation and a key source of error that can ultimately steer PINN models to converge towards erroneous solutions. We address this pathology by proposing a simple re-formulation of PINNs loss functions that can explicitly account for physical causality during model training. We demonstrate that this simple modification alone is enough to introduce significant accuracy improvements, as well as a practical quantitative mechanism for assessing the convergence of a PINNs model. We provide state-of-the-art numerical results across a series of benchmarks for which existing PINNs formulations fail, including the chaotic Lorenz system, the Kuramoto-Sivashinsky equation in the chaotic regime, and the Navier-Stokes equations in the turbulent regime. To the best of our knowledge, this is the first time that PINNs have been successful in simulating such systems, introducing new opportunities for their applicability to problems of industrial complexity.

LGAug 16, 2023

An Expert's Guide to Training Physics-informed Neural Networks

Sifan Wang, Shyam Sankaran, Hanwen Wang et al.

Physics-informed neural networks (PINNs) have been popularized as a deep learning framework that can seamlessly synthesize observational data and partial differential equation (PDE) constraints. Their practical effectiveness however can be hampered by training pathologies, but also oftentimes by poor choices made by users who lack deep learning expertise. In this paper we present a series of best practices that can significantly improve the training efficiency and overall accuracy of PINNs. We also put forth a series of challenging benchmark problems that highlight some of the most prominent difficulties in training PINNs, and present comprehensive and fully reproducible ablation studies that demonstrate how different architecture choices and training strategies affect the test accuracy of the resulting models. We show that the methods and guiding principles put forth in this study lead to state-of-the-art results and provide strong baselines that future studies should use for comparison purposes. To this end, we also release a highly optimized library in JAX that can be used to reproduce all results reported in this paper, enable future research studies, as well as facilitate easy adaptation to new use-case scenarios.

CESep 23, 2024

Micrometer: Micromechanics Transformer for Predicting Mechanical Responses of Heterogeneous Materials

Sifan Wang, Tong-Rui Liu, Shyam Sankaran et al.

Heterogeneous materials, crucial in various engineering applications, exhibit complex multiscale behavior, which challenges the effectiveness of traditional computational methods. In this work, we introduce the Micromechanics Transformer ({\em Micrometer}), an artificial intelligence (AI) framework for predicting the mechanical response of heterogeneous materials, bridging the gap between advanced data-driven methods and complex solid mechanics problems. Trained on a large-scale high-resolution dataset of 2D fiber-reinforced composites, Micrometer can achieve state-of-the-art performance in predicting microscale strain fields across a wide range of microstructures, material properties under any loading conditions and We demonstrate the accuracy and computational efficiency of Micrometer through applications in computational homogenization and multiscale modeling, where Micrometer achieves 1\% error in predicting macroscale stress fields while reducing computational time by up to two orders of magnitude compared to conventional numerical solvers. We further showcase the adaptability of the proposed model through transfer learning experiments on new materials with limited data, highlighting its potential to tackle diverse scenarios in mechanical analysis of solid materials. Our work represents a significant step towards AI-driven innovation in computational solid mechanics, addressing the limitations of traditional numerical methods and paving the way for more efficient simulations of heterogeneous materials across various industrial applications.

90.2LGMay 25

Small Models, Strong Priors: Architectural Inductive Bias for Parameter-Efficient Neural PDE Solvers

Shyam Sankaran, Hanwen Wang, Paris Perdikaris

Neural PDE solvers have followed the scaling trajectory of vision and language, with recent foundation models reaching billions of parameters. We argue that scale is a poor substitute for architectural inductive bias in this domain: structured priors deliver outsized parameter efficiency, and the pattern of where they succeed and fail is itself informative about what they capture. We instantiate this argument in WaveLiT, an architecture combining a discrete wavelet transform for lossless multi-resolution tokenization, an augmented linear attention block, a shared-weight multiscale feature pyramid, and a wavelet-domain auxiliary loss. Bespoke 1-10M-parameter WaveLiT models compete with foundation models of 100-1000$\times$ their size across eight TheWell benchmarks, with the largest gains on wave and acoustic-dominated benchmarks where the wavelet-multiscale prior fits the dominant dynamical structure and small per-step errors do not compound geometrically under rollout. Trained jointly across all eight benchmarks, a 10M-parameter foundation variant exhibits a structured, physically interpretable transfer pattern -- strongest where the wavelet-multiscale prior matches the dynamics, weakest on chaotic advection-dominated flows. The entire pipeline trains on a single GPU. The results suggest that small-model PDE performance is shaped by architectural inductive bias rather than scale, and that the structure of a prior's failures is a useful empirical signal about its content.

LGMay 22, 2024

CViT: Continuous Vision Transformer for Operator Learning

Sifan Wang, Jacob H Seidman, Shyam Sankaran et al.

Operator learning, which aims to approximate maps between infinite-dimensional function spaces, is an important area in scientific machine learning with applications across various physical domains. Here we introduce the Continuous Vision Transformer (CViT), a novel neural operator architecture that leverages advances in computer vision to address challenges in learning complex physical systems. CViT combines a vision transformer encoder, a novel grid-based coordinate embedding, and a query-wise cross-attention mechanism to effectively capture multi-scale dependencies. This design allows for flexible output representations and consistent evaluation at arbitrary resolutions. We demonstrate CViT's effectiveness across a diverse range of partial differential equation (PDE) systems, including fluid dynamics, climate modeling, and reaction-diffusion processes. Our comprehensive experiments show that CViT achieves state-of-the-art performance on multiple benchmarks, often surpassing larger foundation models, even without extensive pretraining and roll-out fine-tuning. Taken together, CViT exhibits robust handling of discontinuous solutions, multi-scale features, and intricate spatio-temporal dynamics. Our contributions can be viewed as a significant step towards adapting advanced computer vision architectures for building more flexible and accurate machine learning models in the physical sciences.

LGJul 11, 2025

Simulating Three-dimensional Turbulence with Physics-informed Neural Networks

Sifan Wang, Shyam Sankaran, Xiantao Fan et al.

Turbulent fluid flows are among the most computationally demanding problems in science, requiring enormous computational resources that become prohibitive at high flow speeds. Physics-informed neural networks (PINNs) represent a radically different approach that trains neural networks directly from physical equations rather than data, offering the potential for continuous, mesh-free solutions. Here we show that appropriately designed PINNs can successfully simulate fully turbulent flows in both two and three dimensions, directly learning solutions to the fundamental fluid equations without traditional computational grids or training data. Our approach combines several algorithmic innovations including adaptive network architectures, causal training, and advanced optimization methods to overcome the inherent challenges of learning chaotic dynamics. Through rigorous validation on challenging turbulence problems, we demonstrate that PINNs accurately reproduce key flow statistics including energy spectra, kinetic energy, enstrophy, and Reynolds stresses. Our results demonstrate that neural equation solvers can handle complex chaotic systems, opening new possibilities for continuous turbulence modeling that transcends traditional computational limitations.