FlowDA: Accurate, Low-Latency Weather Data Assimilation via Flow Matching
FlowDA addresses data assimilation, a major computational bottleneck in weather prediction that matters to both meteorologists and ML practitioners, and offers an incremental improvement over existing generative methods.
The paper tackles the computational bottleneck in weather data assimilation by proposing FlowDA, a flow matching-based framework that conditions on observations and fine-tunes the Aurora foundation model. It reports superior performance over baselines at observation rates from 3.9% down to 0.1%, along with robustness to observational noise and stable long-horizon cycling.
Data assimilation (DA) is a fundamental component of modern weather prediction, yet it remains a major computational bottleneck in machine learning (ML)-based forecasting pipelines due to reliance on traditional variational methods. Recent generative ML-based DA methods offer a promising alternative but typically require many sampling steps and suffer from error accumulation under long-horizon auto-regressive rollouts with cycling assimilation. We propose FlowDA, a low-latency weather-scale generative DA framework based on flow matching. FlowDA conditions on observations through a SetConv-based embedding and fine-tunes the Aurora foundation model to deliver accurate, efficient, and robust analyses. Experiments across observation rates decreasing from $3.9\%$ to $0.1\%$ demonstrate superior performance of FlowDA over strong baselines with similar tunable-parameter size. FlowDA further shows robustness to observational noise and stable performance in long-horizon auto-regressive cycling DA. Overall, FlowDA points to an efficient and scalable direction for data-driven DA.
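To make the low-latency claim concrete, the following is a minimal sketch of how a flow matching sampler can produce an analysis in a handful of steps. It is not the FlowDA implementation: the abstract does not specify the probability path, the ODE solver, or the number of steps, so the straight-line path, the fixed-step Euler integrator, and `num_steps=4` are assumptions. The names `interpolate`, `target_velocity`, `euler_sample`, and `oracle_velocity` are hypothetical, `obs_embedding` merely stands in for the SetConv-based observation embedding, and the toy `oracle_velocity` replaces the fine-tuned Aurora network.

```python
import numpy as np


def interpolate(x0, x1, t):
    """Point on the straight-line path from noise x0 (t=0) to data x1 (t=1).

    Assumption: a linear path; the paper may use a different one.
    """
    return (1.0 - t) * x0 + t * x1


def target_velocity(x0, x1):
    """Regression target for the velocity field along the straight-line path."""
    return x1 - x0


def euler_sample(velocity_fn, x0, obs_embedding, num_steps=4):
    """Integrate dx/dt = v(x, t, obs) from t=0 to t=1 with fixed Euler steps.

    Each step costs one evaluation of velocity_fn, so latency scales with
    num_steps. velocity_fn is a stand-in for a learned conditional network.
    """
    x = np.array(x0, dtype=float)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        x = x + dt * velocity_fn(x, t, obs_embedding)
    return x


def oracle_velocity(x, t, obs_embedding):
    """Toy velocity that points straight at the conditioning vector.

    Hypothetical stand-in for a trained model: it treats obs_embedding as the
    true state, which a real observation embedding would not be.
    """
    return (obs_embedding - x) / (1.0 - t)


if __name__ == "__main__":
    truth = np.array([1.0, -2.0, 0.5])   # toy "true state"
    noise = np.zeros(3)                  # fixed start point for reproducibility
    analysis = euler_sample(oracle_velocity, noise, truth, num_steps=4)
    print(analysis)
```

In this toy setting the sampler reaches the conditioning vector because the oracle velocity is exact; with a learned velocity field, the number of steps trades accuracy against latency.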