AO-PHJul 8, 2024
Leveraging data-driven weather models for improving numerical weather prediction skill through large-scale spectral nudgingSyed Zahid Husain, Leo Separovic, Jean-François Caron et al.
Operational meteorological forecasting has long relied on physics-based numerical weather prediction (NWP) models. Recently, this landscape has faced disruption by the advent of data-driven artificial intelligence (AI)-based weather models, which offer tremendous computational performance and competitive forecasting accuracy. However, data-driven models for medium-range forecasting generally suffer from major limitations, including low effective resolution and a narrow range of predicted variables. This study illustrates the relative strengths and weaknesses of these competing paradigms using the physics-based GEM (Global Environmental Multiscale) and the AI-based GraphCast models. Analyses of their respective global predictions in physical and spectral space reveal that GraphCast-predicted large scales outperform GEM, particularly for longer lead times, even though fine scales predicted by GraphCast suffer from excessive smoothing. Building on this insight, a hybrid NWP-AI system is proposed, wherein temperature and horizontal wind components predicted by GEM are spectrally nudged toward GraphCast predictions at large scales, while GEM itself freely generates the fine-scale details critical for local predictability and weather extremes. This hybrid approach is capable of leveraging the strengths of GraphCast to enhance the prediction skill of the GEM model while generating a full suite of physically consistent forecast fields with a full power spectrum. Additionally, trajectories of tropical cyclones are predicted with enhanced accuracy without significant changes in intensity. Work is in progress for operationalization of this hybrid system at the Canadian Meteorological Centre.
LGAug 26, 2024
Efficient fine-tuning of 37-level GraphCast with the Canadian global deterministic analysisChristopher Subich
This work describes a process for efficiently fine-tuning the GraphCast data-driven forecast model to simulate another analysis system, here the Global Deterministic Prediction System (GDPS) of Environment and Climate Change Canada (ECCC). Using two years of training data (July 2019 -- December 2021) and 37 GPU-days of computation to tune the 37-level, quarter-degree version of GraphCast, the resulting model significantly outperforms both the unmodified GraphCast and operational forecast, showing significant forecast skill in the troposphere over lead times from 1 to 10 days. This fine-tuning is accomplished through abbreviating DeepMind's original training curriculum for GraphCast, relying on a shorter single-step forecast stage to accomplish the bulk of the adaptation work and consolidating the autoregressive stages into separate 12hr, 1d, 2d, and 3d stages with larger learning rates. Additionally, training over 3d forecasts is split into two sub-steps to conserve host memory while maintaining a strong correlation with training over the full period.
LGJan 29
Learning to Advect: A Neural Semi-Lagrangian Architecture for Weather ForecastingCarlos A. Pereira, Stéphane Gaudreault, Valentin Dallerit et al.
Recent machine-learning approaches to weather forecasting often employ a monolithic architecture, where distinct physical mechanisms (advection, transport), diffusion-like mixing, thermodynamic processes, and forcing are represented implicitly within a single large network. This representation is particularly problematic for advection, where long-range transport must be treated with expensive global interaction mechanisms or through deep, stacked convolutional layers. To mitigate this, we present PARADIS, a physics-inspired global weather prediction model that imposes inductive biases on network behavior through a functional decomposition into advection, diffusion, and reaction blocks acting on latent variables. We implement advection through a Neural Semi-Lagrangian operator that performs trajectory-based transport via differentiable interpolation on the sphere, enabling end-to-end learning of both the latent modes to be transported and their characteristic trajectories. Diffusion-like processes are modeled through depthwise-separable spatial mixing, while local source terms and vertical interactions are modeled via pointwise channel interactions, enabling operator-level physical structure. PARADIS provides state-of-the-art forecast skill at a fraction of the training cost. On ERA5-based benchmarks, the 1 degree PARADIS model, with a total training cost of less than a GPU month, meets or exceeds the performance of 0.25 degree traditional and machine-learning baselines, including the ECMWF HRES forecast and DeepMind's GraphCast.
LGJan 31, 2025
Fixing the Double Penalty in Data-Driven Weather Forecasting Through a Modified Spherical Harmonic Loss FunctionChristopher Subich, Syed Zahid Husain, Leo Separovic et al.
Recent advancements in data-driven weather forecasting models have delivered deterministic models that outperform the leading operational forecast systems based on traditional, physics-based models. However, these data-driven models are typically trained with a mean squared error loss function, which causes smoothing of fine scales through a "double penalty" effect. We develop a simple, parameter-free modification to this loss function that avoids this problem by separating the loss attributable to decorrelation from the loss attributable to spectral amplitude errors. Fine-tuning the GraphCast model with this new loss function results in sharp deterministic weather forecasts, an increase of the model's effective resolution from 1,250km to 160km, improvements to ensemble spread, and improvements to predictions of tropical cyclone strength and surface wind extremes.