PhysMetrics.Weather: An Evaluation Framework for Physical Consistency in ML Weather Models
For researchers and developers of ML weather models, this framework provides a tool to evaluate physical realism, guiding the development of more reliable models for operational forecasting.
The paper introduces PhysMetrics.Weather, an evaluation framework that assesses the physical consistency of ML weather prediction models using conservation, spectral, and dynamical metrics, addressing the lack of guarantees that data-driven forecasts obey known physical laws.
Machine learning weather prediction (MLWP) models have achieved impressive forecasting performance at a small fraction of the computational cost of traditional physics-based methods. However, they are primarily (1) data-driven and (2) evaluated using pixel-wise error metrics (e.g., RMSE), so there is no guarantee that their forecasts are consistent with known physical laws. We introduce PhysMetrics.Weather, an evaluation framework that assesses the physical realism of MLWP models across three types of metrics: conservation, spectral, and dynamical. By quantifying physical realism, this tool guides the development of physics-informed architectures and helps evaluate whether MLWP models are reliable for operational use. Our framework is available on GitHub at https://github.com/Emmakast/PhysMetrics.Weather.
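To make the distinction between pixel-wise error and physical consistency concrete, the following is a minimal NumPy sketch of two checks in the spirit of the conservation and spectral metric families. It is not the PhysMetrics.Weather API: the function names (`global_mean`, `conservation_drift`, `zonal_power_spectrum`), the regular latitude-longitude grid, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def area_weights(lat_deg):
    """Cosine-latitude weights for a regular lat-lon grid, normalized to sum to 1."""
    w = np.cos(np.deg2rad(lat_deg))
    return w / w.sum()


def global_mean(field, lat_deg):
    """Area-weighted global mean of a field with shape (..., lat, lon)."""
    zonal_mean = field.mean(axis=-1)  # average over longitude
    return (zonal_mean * area_weights(lat_deg)).sum(axis=-1)


def conservation_drift(forecast, lat_deg):
    """Relative change of the global mean with respect to the first lead time.

    For a quantity that should be conserved, values far from zero
    indicate that the rollout is gaining or losing it.
    """
    g = global_mean(forecast, lat_deg)
    return (g - g[0]) / g[0]


def zonal_power_spectrum(field):
    """Power per zonal wavenumber, averaged over latitude, for a (lat, lon) field."""
    coeffs = np.fft.rfft(field, axis=-1) / field.shape[-1]
    return (np.abs(coeffs) ** 2).mean(axis=0)


if __name__ == "__main__":
    # Synthetic example (assumed data): a uniform field that slowly "leaks" upward.
    lat = np.linspace(-87.5, 87.5, 36)
    truth = np.full((4, 36, 72), 1000.0)
    leak = 1.0 + 0.001 * np.arange(4)
    forecast = truth * leak[:, None, None]

    # Unweighted pixel-wise RMSE per lead time, for comparison.
    rmse = np.sqrt(((forecast - truth) ** 2).mean(axis=(1, 2)))
    print("RMSE per step:", rmse)
    print("relative drift per step:", conservation_drift(forecast, lat))
```

In this toy case the RMSE stays small relative to the field magnitude, while the drift shows a systematic, monotonic gain in the global mean. Comparing `zonal_power_spectrum` between a forecast and a reference field is one simple way to detect the loss of small-scale variance (blurring) that error metrics such as RMSE do not penalize.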