(Sparse) Attention to the Details: Preserving Spectral Fidelity in ML-based Weather Forecasting Models

arXiv:2604.16429
AI Analysis

For operational weather forecasting, Mosaic provides a computationally efficient method to produce well-calibrated ensembles with preserved spectral fidelity, overcoming key limitations of current ML models.

Mosaic addresses spectral degradation in ML-based weather forecasting caused by deterministic training and compressive encoding. It achieves state-of-the-art results among 1.5° models, with near-perfect spectral alignment and a 24-member 10-day forecast in under 12 seconds on a single H100 GPU.

We introduce Mosaic, a probabilistic weather forecasting model that addresses two principal sources of spectral degradation in ML-based weather prediction: (1) deterministic training against ensemble means and (2) compressive encoding creating an information bottleneck. Mosaic generates ensemble members through learned functional perturbations and operates on native-resolution grids via block-sparse attention, a hardware-aligned mechanism that captures long-range dependencies at linear cost by sharing keys and values across spatially adjacent queries. At 1.5° resolution with 214M parameters, Mosaic matches or outperforms models trained on 6 times finer data on headline upper-air variables and achieves state-of-the-art results among 1.5° models, producing well-calibrated ensembles whose individual members exhibit near-perfect spectral alignment across all resolved frequencies. A 24-member, 10-day forecast takes under 12 seconds on a single H100 GPU.
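The idea of sharing keys and values across adjacent queries can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation: the function name `block_sparse_attention`, the `neighbours` table, and the flat 1D ordering of grid points are all assumptions made for illustration, and the abstract does not say how the shared key blocks are chosen. Here the choice is simply passed in. With a fixed number of key blocks per query block, the work grows linearly with the number of grid points.

```python
import math

def block_sparse_attention(q, k, v, block_size, neighbours):
    """Toy block-sparse attention (illustrative assumption, not Mosaic's code).

    q, k, v: lists of float vectors, one per grid point.
    neighbours[b]: key-block indices shared by every query in query block b.
    """
    n, dim = len(q), len(q[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for i in range(n):
        b = i // block_size  # query block that point i belongs to
        # Key/value positions shared by all queries in this block.
        idx = [j for kb in neighbours[b]
               for j in range(kb * block_size, min((kb + 1) * block_size, n))]
        scores = [scale * sum(a * c for a, c in zip(q[i], k[j])) for j in idx]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([sum(w * v[j][d] for w, j in zip(weights, idx)) / total
                    for d in range(len(v[0]))])
    return out

# Usage: 8 points in 4 blocks of 2; each block attends to itself and one
# distant block (a hypothetical pattern chosen only for this example).
q = [[0.1 * i, 1.0] for i in range(8)]
k = [[1.0, 0.1 * i] for i in range(8)]
v = [[float(i)] for i in range(8)]
result = block_sparse_attention(q, k, v, 2, [[0, 3], [1, 2], [2, 0], [3, 1]])
```

Because every query in a block reads the same key and value positions, a real kernel could load those blocks once per query block, which is presumably what makes the scheme hardware-aligned.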
