Physics-Guided Transformer (PGT): Physics-Aware Attention Mechanism for PINNs
PGT addresses the physical inconsistency and instability of physics-informed neural networks (PINNs) in scientific machine learning, offering an architecture that improves performance under data-scarce conditions.
The paper tackles the challenge of reconstructing physical fields from sparse observations governed by PDEs by introducing the Physics-Guided Transformer (PGT), which embeds physical structure into self-attention. In sparse 1D heat equation reconstruction with 100 observations, PGT achieves a relative L2 error of 5.9e-3, outperforming PINNs and sinusoidal representations. On the 2D Navier-Stokes cylinder wake, it combines a low PDE residual (8.3e-4) with a competitive relative error (0.034), where other methods optimize only one of the two.
Reconstructing continuous physical fields from sparse, irregular observations is a central challenge in scientific machine learning, particularly for systems governed by partial differential equations (PDEs). Existing physics-informed methods typically enforce governing equations as soft penalty terms during optimization, often leading to gradient imbalance, instability, and degraded physical consistency under limited data. We introduce the Physics-Guided Transformer (PGT), a neural architecture that embeds physical structure directly into the self-attention mechanism. Specifically, PGT incorporates a heat-kernel-derived additive bias into attention logits, encoding diffusion dynamics and temporal causality within the representation. Query coordinates attend to the resulting physics-conditioned context tokens, and the attended features are decoded using a FiLM-modulated sinusoidal implicit network that adaptively controls spectral response. We evaluate PGT on the one-dimensional heat equation and two-dimensional incompressible Navier-Stokes systems. In sparse 1D reconstruction with 100 observations, PGT achieves a relative L2 error of 5.9e-3, significantly outperforming both PINNs and sinusoidal representations. In the 2D cylinder wake problem, PGT uniquely achieves both low PDE residual (8.3e-4) and competitive relative error (0.034), outperforming methods that optimize only one objective. These results demonstrate that embedding physics within attention improves stability, generalization, and physical fidelity under data-scarce conditions.
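To make the idea of a heat-kernel-derived additive attention bias concrete, the following is a minimal NumPy sketch and not the authors' implementation. The abstract does not give the exact formulation, so several choices here are assumptions: the bias is taken to be the logarithm of the 1D heat kernel between query and context coordinates, temporal causality is enforced by a hard mask on context tokens that lie in the query's future, and the names `heat_kernel_log_bias`, `biased_attention`, `alpha`, and `eps` are hypothetical.

```python
import numpy as np


def heat_kernel_log_bias(q_xt, k_xt, alpha=0.1, eps=1e-6):
    """Assumed form of the bias: log of the 1D heat kernel, causally masked.

    q_xt: (Nq, 2) query coordinates (x, t).
    k_xt: (Nk, 2) context coordinates (x, t).
    Returns an (Nq, Nk) array; entries are -inf where the context token
    is later in time than the query.
    """
    dx = q_xt[:, None, 0] - k_xt[None, :, 0]
    dt = q_xt[:, None, 1] - k_xt[None, :, 1]
    causal = dt >= 0.0
    dt_safe = np.maximum(dt, 0.0) + eps  # avoid division by zero at dt = 0
    # log G(dx, dt), with G = exp(-dx^2 / (4 alpha dt)) / sqrt(4 pi alpha dt)
    log_g = -dx**2 / (4.0 * alpha * dt_safe) - 0.5 * np.log(4.0 * np.pi * alpha * dt_safe)
    return np.where(causal, log_g, -np.inf)


def biased_attention(Q, K, V, bias):
    """Scaled dot-product attention with an additive bias on the logits.

    Assumes every query has at least one causally admissible context token;
    otherwise its row of logits is all -inf and the softmax is undefined.
    """
    d = Q.shape[-1]
    logits = Q @ K.T / np.sqrt(d) + bias
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights


# Toy setup: 8 sparse observations at early times, 4 queries at later times.
rng = np.random.default_rng(0)
ctx = np.column_stack([rng.uniform(0.0, 1.0, 8), rng.uniform(0.0, 0.5, 8)])
qry = np.column_stack([rng.uniform(0.0, 1.0, 4), rng.uniform(0.5, 1.0, 4)])

d_model = 16
Q = rng.normal(size=(4, d_model))  # stand-ins for learned query embeddings
K = rng.normal(size=(8, d_model))  # stand-ins for learned context keys
V = rng.normal(size=(8, d_model))  # stand-ins for learned context values

bias = heat_kernel_log_bias(qry, ctx)
out, w = biased_attention(Q, K, V, bias)
```

Under these assumptions, the bias raises the attention weight of context tokens that are close in space relative to the diffusion length sqrt(4 * alpha * dt), and it removes tokens from the query's future entirely. Adding the bias in log space means that, after the softmax, the learned attention weights are multiplied by the kernel value itself.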