CD | STAT-MECH | LG | AO-PH | Jun 27, 2025

Probing forced responses and causality in data-driven climate emulators: conceptual limitations and the role of reduced-order models

arXiv:2506.22552v5 | 6 citations | h-index: 2
Originality: Incremental advance
AI Analysis

The paper addresses a limitation of data-driven climate emulators that matters to researchers who build models to study causal mechanisms. The advance is rated incremental because it builds on existing linear response theory and coarse-graining approaches.

The paper asks why data-driven climate emulators often fail to reproduce forced responses, which limits their use in causal studies. It argues that reduced-order models, built on coarse-grained representations and careful parameterizations of unresolved processes, are a valuable alternative. A real-world application illustrates the point: a neural model of surface temperature and radiative fluxes infers a multiplicative noise process from data and largely reproduces the system's probability distribution.

A central challenge in climate science and applied mathematics is developing data-driven models of multiscale systems that capture both stationary statistics and responses to external perturbations. Current neural climate emulators aim to resolve the atmosphere-ocean system in all its complexity but often struggle to reproduce forced responses, limiting their use in causal studies such as Green's function experiments. To investigate the origin of these limitations, we first examine a simplified dynamical system that retains key features of climate variability. We interpret the results through linear response theory, providing a rigorous framework to evaluate neural models beyond stationary statistics and probe causal mechanisms. We argue that the ability of multiscale systems' emulators to reproduce perturbed statistics depends critically on (i) identifying an appropriate coarse-grained representation and (ii) careful parameterizations of unresolved processes. For low-frequency climate dynamics, these insights highlight reduced-order models, tailored to specific processes and scales, as valuable alternatives to general-purpose emulators. We next consider a real-world application, developing a neural model to investigate the joint variability of the surface temperature field and radiative fluxes. The model infers a multiplicative noise process directly from data, largely reproduces the system's probability distribution, and enables causal studies through forced responses. We discuss its limitations and outline directions for future work. These results expose key challenges in data-driven modeling of multiscale physical systems and underscore the value of coarse-grained, stochastic approaches, with response theory as a principled framework to guide model design.
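As a rough illustration of how linear response theory links unforced variability to forced responses, the sketch below uses a one-dimensional Ornstein-Uhlenbeck process, dx = -a x dt + sigma dW. This toy system, its parameter values, and the function names `simulate_ou` and `fdt_response` are assumptions made for illustration; they are not the dynamical system or neural model studied in the paper. For a Gaussian process of this kind, the fluctuation-dissipation relation gives the response function as the normalized autocorrelation, R(t) = C(t) / C(0), so the steady-state mean shift under a small constant forcing f can be estimated from unforced data alone and compared with the exact value f / a.

```python
import numpy as np


def simulate_ou(a, sigma, dt, n_steps, rng):
    """Sample a stationary path of dx = -a x dt + sigma dW.

    Uses the exact one-step update of the OU process, which is an AR(1)
    recursion, so there is no time-discretization bias.
    """
    phi = np.exp(-a * dt)
    step_std = sigma * np.sqrt((1.0 - phi**2) / (2.0 * a))
    noise = step_std * rng.standard_normal(n_steps)
    x = np.empty(n_steps)
    # Draw the initial state from the stationary distribution.
    x[0] = rng.standard_normal() * sigma / np.sqrt(2.0 * a)
    for n in range(1, n_steps):
        x[n] = phi * x[n - 1] + noise[n]
    return x


def fdt_response(x, max_lag):
    """Gaussian fluctuation-dissipation estimate R(k) = C(k) / C(0)."""
    x = x - x.mean()
    cov = np.array(
        [np.mean(x[: len(x) - k] * x[k:]) for k in range(max_lag + 1)]
    )
    return cov / cov[0]


# Illustrative parameter values (assumptions, not taken from the paper).
a, sigma, dt, forcing = 1.0, 0.5, 0.05, 0.1
rng = np.random.default_rng(0)

x = simulate_ou(a, sigma, dt, 400_000, rng)
response = fdt_response(x, max_lag=120)  # lags cover 6 decay times

# Steady-state mean shift = forcing * integral of R(t), trapezoidal rule.
integral = dt * (response.sum() - 0.5 * (response[0] + response[-1]))
predicted_shift = forcing * integral
exact_shift = forcing / a  # known analytically for the OU process

print(f"FDT estimate: {predicted_shift:.4f}, exact: {exact_shift:.4f}")
```

The same comparison, applied to an emulator instead of a known process, is the kind of test that goes beyond stationary statistics: a model can match the unforced distribution and still give the wrong response if its coarse-grained variables or noise parameterization are poorly chosen.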

