Physical Consistency of Aurora's Encoder: A Quantitative Study
Large-scale weather forecasting models such as Aurora lack transparency, which hinders their adoption in high-stakes operational settings. This work addresses that gap by applying interpretability methods to validate AI-driven models.
The study investigates whether Aurora's encoder learns physically consistent features by training linear classifiers on its embeddings to identify land-sea boundaries, extreme temperature events, and atmospheric instability. It finds quantitative evidence of physical consistency, but also limitations in capturing the rarest events.
The high accuracy of large-scale weather forecasting models like Aurora is often accompanied by a lack of transparency, as their internal representations remain largely opaque. This "black box" nature hinders their adoption in high-stakes operational settings. In this work, we probe the physical consistency of Aurora's encoder by investigating whether its latent representations align with known physical and meteorological concepts. Using a large-scale dataset of embeddings, we train linear classifiers to identify three distinct concepts: the fundamental land-sea boundary, high-impact extreme temperature events, and atmospheric instability. Our findings provide quantitative evidence that Aurora learns physically consistent features, while also highlighting its limitations in capturing the rarest events. This work underscores the critical need for interpretability methods to validate and build trust in the next generation of AI-driven weather models.
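To make the idea of a linear probe concrete, the following is a minimal sketch, not the pipeline used in this study. It assumes synthetic stand-in embeddings rather than real Aurora encoder outputs, a single binary concept label (think of a land-sea mask), and a plain logistic-regression probe trained by gradient descent. All names (`train_linear_probe`, `balanced_accuracy`, `concept_direction`) and all hyperparameters are hypothetical. A shuffled-label control is included because a probe score is only meaningful relative to what the same classifier achieves when the concept is absent.

```python
import numpy as np

def train_linear_probe(X, y, lr=0.1, epochs=300, l2=1e-3):
    """Fit a logistic-regression probe with full-batch gradient descent."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probabilities
        w -= lr * (X.T @ (p - y) / n + l2 * w)   # gradient step with L2 penalty
        b -= lr * np.mean(p - y)
    return w, b

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; less misleading than accuracy for rare events."""
    tpr = np.mean(y_pred[y_true == 1] == 1)
    tnr = np.mean(y_pred[y_true == 0] == 0)
    return 0.5 * (tpr + tnr)

# Assumption: synthetic "embeddings" in which the concept is encoded along
# one linear direction plus noise. Real encoder embeddings would replace X.
rng = np.random.default_rng(0)
n, d = 4000, 64
X = rng.normal(size=(n, d))
concept_direction = rng.normal(size=d)
concept_direction /= np.linalg.norm(concept_direction)
y = ((X @ concept_direction + 0.3 * rng.normal(size=n)) > 0).astype(float)

# Hold out the last 20% of samples for evaluation.
split = int(0.8 * n)
X_tr, X_te, y_tr, y_te = X[:split], X[split:], y[:split], y[split:]

# Probe trained on the true labels.
w, b = train_linear_probe(X_tr, y_tr)
probe_bal_acc = balanced_accuracy(y_te, ((X_te @ w + b) > 0).astype(float))

# Control: same probe, training labels shuffled to break the link to X.
w_c, b_c = train_linear_probe(X_tr, rng.permutation(y_tr))
control_bal_acc = balanced_accuracy(y_te, ((X_te @ w_c + b_c) > 0).astype(float))

print(f"probe balanced accuracy:   {probe_bal_acc:.3f}")
print(f"control balanced accuracy: {control_bal_acc:.3f}")
```

In this toy setting the probe should score well above the control, which should sit near chance. Balanced accuracy is used here because for rare concepts, such as extreme events, plain accuracy can look high even when the probe never predicts the positive class.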