LG Apr 18, 2023
W-MAE: Pre-trained weather model with masked autoencoder for multi-variable weather forecasting
Xin Man, Chenghong Zhang, Jin Feng et al.
Weather forecasting is a long-standing computational challenge with direct societal and economic impacts. This task involves a large amount of continuous data collection and exhibits rich spatiotemporal dependencies over long periods, making it highly suitable for deep learning models. In this paper, we apply pre-training techniques to weather forecasting and propose W-MAE, a Weather model with Masked AutoEncoder pre-training for weather forecasting. W-MAE is pre-trained in a self-supervised manner to reconstruct spatial correlations within meteorological variables. On the temporal scale, we fine-tune the pre-trained W-MAE to predict the future states of meteorological variables, thereby modeling the temporal dependencies present in weather data. We conduct our experiments using the fifth-generation ECMWF Reanalysis (ERA5) data, with samples selected every six hours. Experimental results show that our W-MAE framework offers three key benefits: 1) when predicting the future state of meteorological variables, using our pre-trained W-MAE effectively alleviates the problem of cumulative errors in prediction, maintaining stable performance in the short-to-medium term; 2) when predicting diagnostic variables (e.g., total precipitation), our model exhibits significant performance advantages over FourCastNet; 3) our task-agnostic pre-training scheme can be easily integrated with various task-specific models. When our pre-training framework is applied to FourCastNet, it yields an average 20% performance improvement in Anomaly Correlation Coefficient (ACC).
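The Anomaly Correlation Coefficient mentioned above can be sketched as follows. This is a minimal illustration of one common latitude-weighted form of ACC, not necessarily the exact definition used in the W-MAE paper; the function name, the cosine-latitude weighting, and the nested-list field layout are assumptions made for the example.

```python
import math


def anomaly_correlation(forecast, truth, climatology, latitudes_deg):
    """Latitude-weighted ACC for a single 2-D field.

    Each field is a list of rows, one row per latitude (hypothetical layout).
    Anomalies are taken relative to the supplied climatology, and each row
    is weighted by cos(latitude) to account for shrinking grid-cell area.
    """
    covariance = 0.0
    forecast_power = 0.0
    truth_power = 0.0
    for f_row, t_row, c_row, lat in zip(forecast, truth, climatology, latitudes_deg):
        weight = math.cos(math.radians(lat))
        for f, t, c in zip(f_row, t_row, c_row):
            f_anom = f - c
            t_anom = t - c
            covariance += weight * f_anom * t_anom
            forecast_power += weight * f_anom * f_anom
            truth_power += weight * t_anom * t_anom
    denominator = math.sqrt(forecast_power * truth_power)
    if denominator == 0.0:
        raise ValueError("all anomalies are zero; ACC is undefined")
    return covariance / denominator


# Toy 3x2 grid: a perfect forecast should score 1, a sign-flipped anomaly -1.
lats = [-45.0, 0.0, 45.0]
clim = [[1.0, 1.0], [2.0, 2.0], [1.0, 1.0]]
obs = [[2.0, 0.0], [3.0, 1.0], [0.0, 2.0]]
flipped = [[2 * c - o for o, c in zip(o_row, c_row)] for o_row, c_row in zip(obs, clim)]

acc_perfect = anomaly_correlation(obs, obs, clim, lats)
acc_flipped = anomaly_correlation(flipped, obs, clim, lats)
```

ACC ranges from -1 to 1, so a reported improvement in ACC reflects forecasts whose departures from climatology line up more closely with the observed departures.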
AO-PH Mar 28
StretchCast: Global-Regional AI Weather Forecasting on Stretched Cubed-Sphere Mesh
Jin Feng
Global AI weather forecasting still relies mainly on uniform-resolution models, making it hard to combine regional refinement, two-way regional-global coupling, and affordable training cost. We introduce StretchCast, a global-regional AI forecasting framework built on a variable-resolution stretched cubed-sphere (SCS) mesh that preserves a closed global domain while concentrating resolution over a target region. Within this framework, we develop a one-step predictor, SCS_Base Model, and a rollout-oriented multistep predictor, SCS_FCST4 Model, to test the feasibility of SCS-based forecasting and the benefit of joint multistep training. Experiments use ERA5 with 69 variables over 1998-2022. Because training compute remains limited, this study uses a coarse-resolution proof-of-concept configuration rather than a final high-resolution system. Even with only about 7,776 effective global grid cells and roughly 0.875 degree resolution over the center-refined face, the 23M-parameter SCS_Base Model yields stable multivariate forecasts. With 83M parameters and training cost on the order of hours, SCS_FCST4 Model delivers competitive medium-range anomaly-correlation evolution over the target region after unified reprojection, especially for geopotential height, specific humidity, and part of the lower-tropospheric winds, while maintaining smooth cross-face continuity and realistic multiscale structure in typhoon and spectral analyses. These results support StretchCast as a practical lightweight foundation for global-regional AI weather forecasting.
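As a rough sanity check on the grid size quoted above, the figure of 7,776 cells is consistent with a cubed sphere of six faces carrying 36 x 36 cells each. The sketch below works through that arithmetic. The per-face size is inferred from the total and is not stated in the abstract, and the helper name is hypothetical.

```python
FACES = 6  # a cubed-sphere mesh has six faces


def cells_per_face_edge(total_cells, faces=FACES):
    """Return n such that faces * n * n == total_cells, or raise if none exists."""
    n = round((total_cells / faces) ** 0.5)
    if faces * n * n != total_cells:
        raise ValueError("total is not faces * n^2 for any integer n")
    return n


edge_cells = cells_per_face_edge(7776)

# On an unstretched cubed sphere each face spans 90 degrees of arc,
# so this would be the nominal spacing without any refinement.
uniform_spacing_deg = 90.0 / edge_cells

# Ratio of that nominal spacing to the 0.875 degree quoted for the
# center-refined face: an illustrative refinement factor only.
refinement_ratio = uniform_spacing_deg / 0.875
```

Under these assumptions the unstretched spacing would be 2.5 degrees, so the quoted 0.875 degree over the refined face corresponds to a local refinement of a little under three times.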
CV Sep 17, 2025
MARS2 2025 Challenge on Multimodal Reasoning: Datasets, Methods, Results, Discussion, and Outlook
Peng Xu, Shengwu Xiong, Jiajun Zhang et al.
This paper reviews the MARS2 2025 Challenge on Multimodal Reasoning. We aim to bring together different approaches in multimodal machine learning and LLMs via a large benchmark. We hope it helps researchers follow the state of the art in this very dynamic area. Meanwhile, a growing number of testbeds have boosted the evolution of general-purpose large language models. Thus, this year's MARS2 focuses on real-world and specialized scenarios to broaden the multimodal reasoning applications of MLLMs. Our organizing team released two tailored datasets, Lens and AdsQA, as test sets, which support general reasoning in 12 daily scenarios and domain-specific reasoning in advertisement videos, respectively. We evaluated 40+ baselines that include both generalist MLLMs and task-specific models, and opened three competition tracks: Visual Grounding in Real-world Scenarios (VG-RS), Visual Question Answering with Spatial Awareness (VQA-SA), and Visual Reasoning in Creative Advertisement Videos (VR-Ads). In total, 76 teams from renowned academic and industrial institutions registered, and 40+ valid submissions (out of 1200+) have been included in our ranking lists. Our datasets, code sets (40+ baselines and 15+ participants' methods), and rankings are publicly available on the MARS2 workshop website and our GitHub organization page https://github.com/mars2workshop/, where updates and announcements of upcoming events will be posted on an ongoing basis.
AO-PH Nov 19, 2024
Leadsee-Precip: A Deep Learning Diagnostic Model for Precipitation
Weiwen Ji, Jin Feng, Yueqi Liu et al.
Recently, deep-learning weather forecasting models have surpassed traditional numerical models in the accuracy of forecast meteorological variables. However, there is considerable room for improvement in precipitation forecasts, especially for heavy precipitation events. To address this deficiency, we propose Leadsee-Precip, a global deep learning model to generate precipitation from meteorological circulation fields. The model utilizes an information balance scheme to tackle the challenges of predicting heavy precipitation caused by the long-tail distribution of precipitation data. Additionally, more accurate satellite and radar-based precipitation retrievals are used as training targets. Compared to artificial intelligence global weather models, the heavy precipitation from Leadsee-Precip is more consistent with observations and shows competitive performance against global numerical weather prediction models. Leadsee-Precip can be integrated with any global circulation model to generate precipitation forecasts. However, deviations between the predicted and ground-truth circulation fields may weaken the precipitation forecast, which could potentially be mitigated by further fine-tuning on the predicted circulation fields.