Towards Universal Solvers: Using PGD Attack in Active Learning to Increase Generalizability of Neural Operators as Knowledge Distillation from Numerical PDE Solvers
This paper addresses the unreliability of neural operators in scientific computing and offers a method to improve their generalizability, though the contribution appears incremental because it builds on existing distillation and adversarial techniques.
The paper tackles the poor out-of-distribution generalization of neural operators such as FNOs and DeepONets on nonlinear PDEs. It proposes an adversarial teacher-student distillation framework that improves robustness while preserving low parameter cost and fast inference, as demonstrated on Burgers and Navier-Stokes systems.
Numerical solvers for nonlinear PDEs require fine space-time discretizations and local linearizations, leading to high memory cost and slow runtimes. Neural operators such as Fourier neural operators (FNOs) and DeepONets offer fast single-shot inference by learning function-to-function mappings and truncating high-frequency components, but they generalize poorly out of distribution (OOD), often failing on inputs outside the training distribution. We propose an adversarial teacher-student distillation framework in which a differentiable numerical solver supervises a compact neural operator, while a projected gradient descent (PGD) style active sampling loop searches for worst-case inputs under smoothness and energy constraints and adds them to the training set. Because the spectral solvers are differentiable, the adversarial search can be gradient-based, which also stabilizes sample mining. Experiments on Burgers and Navier-Stokes systems demonstrate that adversarial distillation substantially improves OOD robustness while preserving the low parameter cost and fast inference of neural operators.
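The PGD-style sampling loop described above can be illustrated with a minimal JAX sketch. It is not the paper's implementation, and everything specific in it is an assumption made for illustration: the 1D viscous Burgers setup, the grid size, the step sizes, the low-pass cutoff `k_max` used as the smoothness constraint, the RMS bound `energy` used as the energy constraint, and above all the `student` function, which is a hypothetical stand-in (a purely diffusive surrogate) for a trained neural operator. The sketch shows only the mechanism: differentiate the student-teacher discrepancy through a spectral solver, take an ascent step on the input, and project back onto the constraint set.

```python
import jax
import jax.numpy as jnp

N = 64                                  # grid points on [0, 2*pi) (assumed)
K = jnp.fft.rfftfreq(N, d=1.0 / N)      # integer wavenumbers 0..N/2


def teacher(u0, nu=0.05, dt=1e-3, steps=50):
    """Differentiable pseudo-spectral solver for u_t + u u_x = nu u_xx."""
    dealias = K <= N // 3               # 2/3-rule dealiasing mask

    def step(u_hat, _):
        u = jnp.fft.irfft(u_hat, n=N)
        # Nonlinear term -(u^2 / 2)_x evaluated in Fourier space.
        nonlinear = -0.5j * K * jnp.fft.rfft(u * u) * dealias
        # Explicit Euler for advection, implicit Euler for diffusion.
        u_hat = (u_hat + dt * nonlinear) / (1.0 + dt * nu * K**2)
        return u_hat, None

    u_hat, _ = jax.lax.scan(step, jnp.fft.rfft(u0), None, length=steps)
    return jnp.fft.irfft(u_hat, n=N)


def student(u0, nu=0.05, t_final=0.05):
    """Hypothetical stand-in for a neural operator: diffusion only."""
    decay = jnp.exp(-nu * K**2 * t_final)
    return jnp.fft.irfft(jnp.fft.rfft(u0) * decay, n=N)


def discrepancy(u0):
    """Mean squared student-teacher error for one initial condition."""
    return jnp.mean((student(u0) - teacher(u0)) ** 2)


def project(u, k_max=8, energy=1.0):
    """Low-pass filter (smoothness), then rescale so that RMS <= energy."""
    u = jnp.fft.irfft(jnp.fft.rfft(u) * (K <= k_max), n=N)
    rms = jnp.sqrt(jnp.mean(u**2))
    return u * jnp.minimum(1.0, energy / (rms + 1e-12))


def pgd_attack(u0, steps=20, step_size=0.1):
    """Gradient ascent on the discrepancy with projection after each step."""
    grad_fn = jax.jit(jax.grad(discrepancy))
    u = project(u0)
    for _ in range(steps):
        g = grad_fn(u)
        g = g / (jnp.sqrt(jnp.mean(g**2)) + 1e-12)  # normalized direction
        u = project(u + step_size * g)
    return u


x = jnp.linspace(0.0, 2.0 * jnp.pi, N, endpoint=False)
u_init = 0.5 * jnp.sin(x)               # in-distribution seed (assumed)
u_adv = pgd_attack(u_init)              # mined worst-case input
```

In an active learning loop, `u_adv` and its teacher solution `teacher(u_adv)` would be appended to the training set before the student is retrained. The retraining step is omitted here because the abstract does not specify it.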