LG, Jun 23, 2025

Finetuning a Weather Foundation Model with Lightweight Decoders for Unseen Physical Processes

Fanny Lehmann, Firat Ozdemir, Benedikt Soja, Torsten Hoefler, Siddhartha Mishra, Sebastian Schemm

arXiv:2506.19088v17.13 citations, h-index: 8

Originality: Incremental advance

AI Analysis

This work aims to make foundation models more accessible for Earth sciences by enabling efficient adaptation to new variables without full fine-tuning. The advance is incremental, since it builds on an existing pretrained model (Aurora).

The study extends a weather foundation model to hydrological variables that were not seen during pretraining by training lightweight decoders on the model's latent representations. Compared with fine-tuning the full model, this approach requires 50% less training time and 35% less memory, while maintaining strong accuracy and preserving model properties such as autoregressive stability.

Recent advances in AI weather forecasting have led to the emergence of so-called "foundation models", typically defined by expensive pretraining and minimal fine-tuning for downstream tasks. However, in the natural sciences, a desirable foundation model should also encode meaningful statistical relationships between the underlying physical variables. This study evaluates the performance of the state-of-the-art Aurora foundation model in predicting hydrological variables, which were not considered during pretraining. We introduce a lightweight approach using shallow decoders trained on the latent representations of the pretrained model to predict these new variables. As a baseline, we compare this to fine-tuning the full model, which allows further optimization of the latent space while incorporating new variables into both inputs and outputs. The decoder-based approach requires 50% less training time and 35% less memory, while achieving strong accuracy across various hydrological variables and preserving desirable properties of the foundation model, such as autoregressive stability. Notably, decoder accuracy depends on the physical correlation between the new variables and those used during pretraining, indicating that Aurora's latent space captures meaningful physical relationships. In this sense, we argue that an important quality metric for foundation models in Earth sciences is their ability to be extended to new variables without full fine-tuning. This provides a new perspective for making foundation models more accessible to communities with limited computational resources, while supporting broader adoption in Earth sciences.
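
The decoder-based idea described in the abstract can be illustrated with a toy sketch. This is not Aurora's API or the authors' implementation: the frozen random-projection "backbone", the dimensions, the single linear decoder layer, and the synthetic target are all assumptions chosen so the example runs with the Python standard library alone. It only shows the general pattern of keeping pretrained weights fixed and fitting a small decoder on the latent vectors.

```python
import math
import random

random.seed(0)

INPUT_DIM = 4    # hypothetical number of pretrained variables
LATENT_DIM = 8   # hypothetical latent size

# Stand-in for the pretrained backbone: fixed weights, never updated below.
FROZEN_W = [[random.gauss(0.0, 1.0) for _ in range(INPUT_DIM)]
            for _ in range(LATENT_DIM)]
frozen_snapshot = [row[:] for row in FROZEN_W]


def frozen_encoder(x):
    """Map an input state to its latent representation (no training here)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_W]


def decode(weights, bias, z):
    """Shallow decoder: a single linear layer on top of the latent vector."""
    return sum(w * zi for w, zi in zip(weights, z)) + bias


def mse(weights, bias, latents, targets):
    """Mean squared error of the decoder over a dataset."""
    errors = [(decode(weights, bias, z) - y) ** 2 for z, y in zip(latents, targets)]
    return sum(errors) / len(errors)


def train_decoder(latents, targets, lr=0.05, epochs=200):
    """Fit only the decoder parameters with plain stochastic gradient descent."""
    weights, bias = [0.0] * LATENT_DIM, 0.0
    for _ in range(epochs):
        for z, y in zip(latents, targets):
            err = decode(weights, bias, z) - y
            weights = [w - lr * err * zi for w, zi in zip(weights, z)]
            bias -= lr * err
    return weights, bias


# Synthetic data: the "new" variable is correlated with two pretrained inputs.
inputs = [[random.uniform(-0.5, 0.5) for _ in range(INPUT_DIM)] for _ in range(64)]
targets = [0.5 * x[0] - 0.3 * x[1] for x in inputs]

# Latents are computed once because the backbone is frozen.
latents = [frozen_encoder(x) for x in inputs]

initial_mse = mse([0.0] * LATENT_DIM, 0.0, latents, targets)
weights, bias = train_decoder(latents, targets)
final_mse = mse(weights, bias, latents, targets)

print(f"MSE before training: {initial_mse:.5f}")
print(f"MSE after training:  {final_mse:.5f}")
```

In this sketch the target is deliberately built from variables the encoder has seen, loosely mirroring the abstract's observation that decoder accuracy depends on how strongly the new variable correlates with the pretrained ones. Because the encoder never changes, its outputs can be computed once and reused across all decoder training epochs.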
