LG, NA | Mar 6, 2024

DPOT: Auto-Regressive Denoising Operator Transformer for Large-Scale PDE Pre-Training

Zhongkai Hao, Chang Su, Songming Liu, Julius Berner, Chengyang Ying, Hang Su, Anima Anandkumar, Jian Song, Jun Zhu

Tsinghua

arXiv:2403.03542v4 | 37.6 | 129 citations | h-index: 19 | Has Code | ICML

Originality: Highly original

AI Analysis

This work addresses the problem of data scarcity and complexity in PDE modeling for researchers and practitioners, representing a significant advance rather than an incremental improvement.

The authors tackle the challenge of pre-training neural operators for partial differential equations (PDEs) by introducing an auto-regressive denoising strategy and a scalable architecture based on Fourier attention. Their model, with up to 0.5B parameters and pre-trained on more than 100k trajectories from 10+ PDE datasets, achieves state-of-the-art results on these benchmarks and improves performance on downstream tasks such as 3D data.

Pre-training has been investigated to improve the efficiency and performance of training neural operators in data-scarce settings. However, it is largely in its infancy due to the inherent complexity and diversity of partial differential equation (PDE) data, such as long trajectories, multiple scales, and varying dimensions. In this paper, we present a new auto-regressive denoising pre-training strategy, which allows for more stable and efficient pre-training on PDE data and generalizes to various downstream tasks. Moreover, by designing a flexible and scalable model architecture based on Fourier attention, we can easily scale up the model for large-scale pre-training. We train our PDE foundation model with up to 0.5B parameters on 10+ PDE datasets with more than 100k trajectories. Extensive experiments show that we achieve SOTA on these benchmarks and validate the strong generalizability of our model to significantly enhance performance on diverse downstream PDE tasks like 3D data. Code is available at https://github.com/thu-ml/DPOT.
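The abstract names an auto-regressive denoising strategy without giving details, so the following is only a minimal sketch of one plausible reading: perturb a window of past frames with Gaussian noise, ask the model to predict the clean next frame, and at inference feed predictions back in as inputs. The function names, the `history` and `noise_scale` parameters, and the choice of Gaussian noise are all assumptions made for illustration and are not taken from the DPOT code.

```python
import random


def make_denoising_pairs(trajectory, history, noise_scale, rng):
    """Build (noisy input window, clean next frame) training pairs.

    trajectory: list of frames, each frame a flat list of floats.
    history: number of past frames fed to the model (assumed parameter).
    noise_scale: std of the Gaussian noise added to inputs (assumed parameter).
    """
    pairs = []
    for t in range(history, len(trajectory)):
        # Noise is added only to the input window; the target stays clean.
        noisy_window = [
            [value + noise_scale * rng.gauss(0.0, 1.0) for value in frame]
            for frame in trajectory[t - history:t]
        ]
        pairs.append((noisy_window, list(trajectory[t])))
    return pairs


def rollout(step_fn, initial_frames, n_steps):
    """Auto-regressive inference: each prediction becomes a later input."""
    frames = [list(frame) for frame in initial_frames]
    history = len(frames)
    for _ in range(n_steps):
        frames.append(step_fn(frames[-history:]))
    return frames[history:]
```

Under this reading, training on noisy inputs is meant to make the one-step predictor tolerant of its own errors, which accumulate when `rollout` feeds predictions back over long trajectories. Here `step_fn` stands in for any trained one-step model.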
