LG · Jun 24, 2025

Maximal Update Parametrization and Zero-Shot Hyperparameter Transfer for Fourier Neural Operators

arXiv:2506.19396v19.43 citations · h-index: 7 · ICML

Originality: Highly original

AI Analysis

This work addresses a bottleneck in scaling neural operators for solving complex PDEs, offering a practical solution for researchers and engineers in computational science.

The paper tackles the computational impracticality of hyperparameter tuning for large-scale Fourier Neural Operators (FNOs) by introducing μTransfer-FNO, a zero-shot transfer technique that reduces tuning costs while maintaining or improving accuracy across various PDEs.

Fourier Neural Operators (FNOs) offer a principled approach for solving complex partial differential equations (PDEs). However, scaling them to handle more complex PDEs requires increasing the number of Fourier modes, which significantly expands the number of model parameters and makes hyperparameter tuning computationally impractical. To address this, we introduce $μ$Transfer-FNO, a zero-shot hyperparameter transfer technique that enables optimal configurations, tuned on smaller FNOs, to be directly applied to billion-parameter FNOs without additional tuning. Building on the Maximal Update Parametrization ($μ$P) framework, we mathematically derive a parametrization scheme that facilitates the transfer of optimal hyperparameters across models with different numbers of Fourier modes in FNOs, which is validated through extensive experiments on various PDEs. Our empirical study shows that $μ$Transfer-FNO reduces computational cost for tuning hyperparameters on large FNOs while maintaining or improving accuracy.
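To make the scaling bottleneck concrete, the sketch below counts the parameters of a single spectral convolution layer as the number of Fourier modes grows. It is not taken from the paper: it assumes the common FNO layout of one complex weight per (input channel, output channel, retained mode) triple, with the same number of modes kept along each spatial dimension, and the function name and arguments are hypothetical.

```python
def spectral_layer_params(channels: int, modes: int, dim: int) -> int:
    """Real-valued parameter count of one spectral convolution layer.

    Assumption (not from the paper): one complex weight per
    (input channel, output channel, retained mode) triple, with `modes`
    retained along each of `dim` spatial dimensions.
    """
    complex_weights = channels * channels * modes ** dim
    # Each complex weight stores a real and an imaginary part.
    return 2 * complex_weights


if __name__ == "__main__":
    # Doubling the modes in a 3D problem multiplies the count by 2**3 = 8.
    for k in (8, 16, 32, 64):
        print(k, spectral_layer_params(channels=64, modes=k, dim=3))
```

Under these assumptions the count grows as the number of modes raised to the spatial dimension, which is why a hyperparameter sweep run directly on the largest model quickly becomes expensive.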
