The physics of AI weather models
Provides a theoretical framework for understanding how AI weather models work, which could guide meteorologists and AI researchers in designing future models.
The authors investigate whether AI weather models implicitly solve physical equations. They find evidence that different models represent the atmosphere similarly, and they hypothesize a particle-based gradient flow mechanism. Analysis of GraphCast and Aurora shows processing that moves from large to small spatial scales with layer depth, consistent with this hypothesis.
Could it be that AI weather models are solving physical equations, although they may not be the equations used by conventional NWP models? We compute correlations of forecast skill and Centered Kernel Alignment (CKA), providing evidence that different AI weather models represent the atmosphere in similar ways, despite differences in architecture and capacity. We argue that the architecture and training of the AI models constrain the form of the physical laws that they might simulate. In particular, we propose that the models implement a particle description of the atmosphere, where the latent variables at each mesh point correspond to the position of a particle in the high-dimensional latent space. We hypothesize that the movement of the particles follows a gradient flow in the latent space towards a minimum of a learned free energy functional. Analysis of the GraphCast and Aurora models shows that they make changes on large spatial scales in the early processor layers and move to smaller scales with increasing layer depth, consistent with the gradient flow hypothesis.
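For readers unfamiliar with Centered Kernel Alignment, the following is a minimal sketch of the linear-kernel variant in Python with NumPy. It is an illustration only: the abstract does not say which kernel or implementation the authors use, so the choice of a linear kernel is an assumption, and the function name `linear_cka` and the random arrays standing in for model latents are hypothetical. Rows are taken to be matched samples (for example, the same mesh points and times) and columns to be latent features.

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between two representation matrices with matched rows.

    x has shape (n_samples, p) and y has shape (n_samples, q); the feature
    dimensions p and q may differ. Assumption: linear kernel.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape[0] != y.shape[0]:
        raise ValueError("x and y must have the same number of rows")

    # Center every feature column so the score ignores constant offsets.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)

    # Linear CKA = ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F).
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


# Hypothetical stand-ins for latents from different models (not real data).
rng = np.random.default_rng(0)
latents_a = rng.normal(size=(200, 16))

# A rotated and rescaled copy of latents_a: same geometry, different basis.
rotation, _ = np.linalg.qr(rng.normal(size=(16, 16)))
latents_b = 3.0 * latents_a @ rotation

# Independent noise with a different feature width.
latents_c = rng.normal(size=(200, 32))

cka_same = linear_cka(latents_a, latents_b)
cka_diff = linear_cka(latents_a, latents_c)
print(f"rotated copy: {cka_same:.3f}, independent noise: {cka_diff:.3f}")
```

Linear CKA is invariant to orthogonal transformations and isotropic rescaling of the features, and it accepts representations of different widths, which is what makes a score of this kind usable for comparing models that differ in architecture and capacity.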