Peer Nowack

LG
h-index: 48
5 papers
53 citations
Novelty: 41%
AI Score: 44

5 Papers

LG · Nov 7, 2023
ClimateSet: A Large-Scale Climate Model Dataset for Machine Learning

Julia Kaltenborn, Charlotte E. E. Lange, Venkatesh Ramesh et al. · mila

Climate models have been key for assessing the impact of climate change and simulating future climate scenarios. The machine learning (ML) community has taken an increased interest in supporting climate scientists' efforts on various tasks such as climate model emulation, downscaling, and prediction tasks. Many of those tasks have been addressed on datasets created with single climate models. However, both the climate science and ML communities have suggested that to address those tasks at scale, we need large, consistent, and ML-ready climate model datasets. Here, we introduce ClimateSet, a dataset containing the inputs and outputs of 36 climate models from the Input4MIPs and CMIP6 archives. In addition, we provide a modular dataset pipeline for retrieving and preprocessing additional climate models and scenarios. We showcase the potential of our dataset by using it as a benchmark for ML-based climate model emulation. We gain new insights about the performance and generalization capabilities of the different ML models by analyzing their performance across different climate models. Furthermore, the dataset can be used to train an ML emulator on several climate models instead of just one. Such a "super emulator" can quickly project new climate change scenarios, complementing existing scenarios already provided to policymakers. We believe ClimateSet will create the basis needed for the ML community to tackle climate-related tasks at scale.

47.9 · LG · May 16
Emulating the Forced Response of Climate Models with Flow Matching

Graham Clyne, Julia Kaltenborn, Peer Nowack et al.

Global climate models are essential tools to simulate past and potential future pathways of climate change, as well as associated climate impacts. Shared Socioeconomic Pathways (SSPs) describe a range of future scenarios of global economic and demographic development. These SSPs are intrinsically linked to changes in climate forcings, the external drivers, such as greenhouse gas and aerosol emissions, which in turn lead to the human impact on the energy balance of the Earth over time. These forcings are fundamental boundary conditions in climate models in order to gain insight into the potential climatic impacts of these changes described by each SSP. Running a climate model, however, is extremely computationally expensive, conflicting with the need for large ensembles of simulations for each model to give, e.g., more robust estimates in the presence of internal variability (the inherent, chaotic fluctuations within the climate system) and scenario uncertainty. Recent research has demonstrated the ability to capture climate model dynamics using machine learning when conditioned on forcings from different climatic scenarios. We here train a Deep Learning (DL) model on multiple SSPs and successfully generate scenarios unseen during training. Our emulator is validated against MESMER-M, a statistical emulator of land surface temperature. Our research demonstrates the capacity to generate such changing climate states in response to a variety of simultaneous climate forcings (e.g., carbon dioxide, methane, nitrous oxide, sulphate aerosols, and ozone). In particular, our ablation studies underline a need to include a range of different forcings to represent long-term atmospheric trends with a DL emulator.

LG · Dec 5, 2023
Towards Causal Representations of Climate Model Data

Julien Boussard, Chandni Nagda, Julia Kaltenborn et al.

Climate models, such as Earth system models (ESMs), are crucial for simulating future climate change based on projected Shared Socioeconomic Pathways (SSP) greenhouse gas emissions scenarios. While ESMs are sophisticated and invaluable, machine learning-based emulators trained on existing simulation data can project additional climate scenarios much faster and are computationally efficient. However, they often lack generalizability and interpretability. This work delves into the potential of causal representation learning, specifically the Causal Discovery with Single-parent Decoding (CDSD) method, which could render climate model emulation efficient and interpretable. We evaluate CDSD on multiple climate datasets, focusing on emissions, temperature, and precipitation. Our findings shed light on the challenges, limitations, and promise of using CDSD as a stepping stone towards more interpretable and robust climate model emulation.

LG · Jun 11, 2025
Causal Climate Emulation with Bayesian Filtering

Sebastian Hickman, Ilija Trajkovic, Julia Kaltenborn et al.

Traditional models of climate change use complex systems of coupled equations to simulate physical processes across the Earth system. These simulations are highly computationally expensive, limiting our predictions of climate change and analyses of its causes and effects. Machine learning has the potential to quickly emulate data from climate models, but current approaches are not able to incorporate physically-based causal relationships. Here, we develop an interpretable climate model emulator based on causal representation learning. We derive a novel approach including a Bayesian filter for stable long-term autoregressive emulation. We demonstrate that our emulator learns accurate climate dynamics, and we show the importance of each one of its components on a realistic synthetic dataset and data from two widely deployed climate models.

LG · Sep 24, 2025
mloz: A Highly Efficient Machine Learning-Based Ozone Parameterization for Climate Sensitivity Simulations

Yiling Ma, Nathan Luke Abraham, Stefan Versick et al.

Atmospheric ozone is a crucial absorber of solar radiation and an important greenhouse gas. However, most climate models participating in the Coupled Model Intercomparison Project (CMIP) still lack an interactive representation of ozone due to the high computational costs of atmospheric chemistry schemes. Here, we introduce a machine learning parameterization (mloz) to interactively model daily ozone variability and trends across the troposphere and stratosphere in standard climate sensitivity simulations, including two-way interactions of ozone with the Quasi-Biennial Oscillation. We demonstrate its high fidelity on decadal timescales and its flexible use online across two different climate models -- the UK Earth System Model (UKESM) and the German ICOsahedral Nonhydrostatic (ICON) model. With atmospheric temperature profile information as the only input, mloz produces stable ozone predictions around 31 times faster than the chemistry scheme in UKESM, contributing less than 4 percent of the respective total climate model runtimes. In particular, we also demonstrate its transferability to different climate models without chemistry schemes by transferring the parameterization from UKESM to ICON. This highlights the potential for widespread adoption in CMIP-level climate models that lack interactive chemistry for future climate change assessments, particularly when focusing on climate sensitivity simulations, where ozone trends and variability are known to significantly modulate atmospheric feedback processes.