CD | LG | Mar 23, 2025

Regularization of ML models for Earth systems by using longer model timesteps

arXiv:2503.18023v1 | 1 citation | h-index: 4
Originality: Synthesis-oriented
AI Analysis

The paper provides an easy-to-implement regularization method for Earth system modeling that addresses overconfident predictions in chaotic systems. The contribution is incremental: it adapts an existing regularization concept to a specific domain.

The paper aims to improve the generalization of machine learning models for chaotic Earth systems by using longer model timesteps as a form of regularization, and shows that a 28-day timestep yields realistic simulations on ORAS5 ocean reanalysis data.

Regularization is a technique to improve generalization of machine learning (ML) models. A common form of regularization in the ML literature is to train on data where similar inputs map to different outputs. This improves generalization by preventing ML models from becoming overconfident in their predictions. This paper shows how using longer timesteps when modelling chaotic Earth systems naturally leads to more of this regularization. We show this in two domains. We explain how using longer model timesteps can improve results and demonstrate that increased regularization is one of the causes. We explain why longer model timesteps lead to improved regularization in these systems and present a procedure to pick the model timestep. We also carry out a benchmarking exercise on ORAS5 ocean reanalysis data to show that a longer model timestep (28 days) than is typically used gives realistic simulations. We suggest that there will be many opportunities to use this type of regularization in Earth system problems because the Earth system is chaotic and the regularization is so easy to implement.
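
The mechanism in the abstract, that a longer timestep makes similar inputs map to more different outputs in a chaotic system, can be illustrated with a toy sketch. This is not the paper's method, its timestep-selection procedure, or its ORAS5 experiment. It assumes the logistic map as a stand-in chaotic system, and every name in it (`logistic_step`, `advance`, `output_spread`) is hypothetical. The sketch perturbs each input by a tiny amount and measures how far apart the two outputs are after a given number of steps.

```python
import numpy as np

def logistic_step(x, r=4.0):
    """One step of the logistic map, which is chaotic for r = 4 (toy stand-in system)."""
    return r * x * (1.0 - x)

def advance(x, n_steps):
    """Advance the state by n_steps, i.e. one 'model timestep' of length n_steps."""
    for _ in range(n_steps):
        x = logistic_step(x)
    return x

def output_spread(n_steps, n_pairs=2000, eps=1e-6, seed=0):
    """Mean absolute output difference between pairs of inputs that differ by eps."""
    rng = np.random.default_rng(seed)
    x0 = rng.uniform(0.1, 0.9, size=n_pairs)
    x1 = x0 + eps  # nearly identical input
    return float(np.mean(np.abs(advance(x0, n_steps) - advance(x1, n_steps))))

# Spread of outputs for near-identical inputs at increasing timestep lengths.
spreads = {n: output_spread(n) for n in (1, 5, 10, 20)}
for n, s in spreads.items():
    print(f"timestep = {n:2d} map steps, mean output spread = {s:.3e}")
```

In this toy setting the spread grows with the timestep length, so a model trained to predict across a longer timestep sees near-identical inputs paired with visibly different targets, which is the kind of training data the abstract describes as regularizing.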
