CDApr 14
Physics and causally constrained discrete-time neural models of turbulent dynamical systemsFabrizio Falasca, Laure Zanna
We present a framework for constructing physics and causally constrained neural models of turbulent dynamical systems from data. We first formulate a finite-time flow map with strict energy-preserving nonlinearities for stable modeling of temporally discrete trajectories. We then impose causal constraints to suppress spurious interactions across degrees of freedom. The resulting neural models accurately capture stationary statistics and responses to both small and large external forcings. We demonstrate the framework on the stochastic Charney-DeVore equations and on a symmetry-broken Lorenz-96 system. The framework is broadly applicable to reduced-order modeling of turbulent dynamical systems from observational data.
MLApr 17, 2025
Predicting Forced Responses of Probability Distributions via the Fluctuation-Dissipation Theorem and Generative ModelingLudovico T. Giorgini, Fabrizio Falasca, Andre N. Souza
We present a novel and flexible data-driven framework for estimating the response of higher-order moments of nonlinear stochastic systems to small external perturbations. The classical Generalized Fluctuation--Dissipation Theorem (GFDT) links the unperturbed steady-state distribution to the system's linear response. While standard implementations relying on Gaussian approximations can predict the mean response, they often fail to capture changes in higher-order moments. To overcome this, we combine GFDT with score-based generative modeling to estimate the system's score function directly from data. We demonstrate the framework's versatility by employing two complementary score estimation techniques tailored to the system's characteristics: (i) a clustering-based algorithm (KGMM) for systems with low-dimensional effective dynamics, and (ii) a denoising score matching method implemented with a U-Net architecture for high-dimensional, spatially-extended systems where reduced-order modeling is not feasible. Our method is validated on several stochastic models relevant to climate dynamics: three reduced-order models of increasing complexity and a 2D Navier--Stokes model representing a turbulent flow with a localized perturbation. In all cases, the approach accurately captures strongly nonlinear and non-Gaussian features of the system's response, significantly outperforming traditional Gaussian approximations.
CDJun 27, 2025
Probing forced responses and causality in data-driven climate emulators: conceptual limitations and the role of reduced-order modelsFabrizio Falasca
A central challenge in climate science and applied mathematics is developing data-driven models of multiscale systems that capture both stationary statistics and responses to external perturbations. Current neural climate emulators aim to resolve the atmosphere-ocean system in all its complexity but often struggle to reproduce forced responses, limiting their use in causal studies such as Green's function experiments. To investigate the origin of these limitations, we first examine a simplified dynamical system that retains key features of climate variability. We interpret the results through linear response theory, providing a rigorous framework to evaluate neural models beyond stationary statistics and probe causal mechanisms. We argue that the ability of multiscale systems' emulators to reproduce perturbed statistics depends critically on (i) identifying an appropriate coarse-grained representation and (ii) careful parameterizations of unresolved processes. For low-frequency climate dynamics, these insights highlight reduced-order models, tailored to specific processes and scales, as valuable alternatives to general-purpose emulators. We next consider a real-world application, developing a neural model to investigate the joint variability of the surface temperature field and radiative fluxes. The model infers a multiplicative noise process directly from data, largely reproduces the system's probability distribution, and enables causal studies through forced responses. We discuss its limitations and outline directions for future work. These results expose key challenges in data-driven modeling of multiscale physical systems and underscore the value of coarse-grained, stochastic approaches, with response theory as a principled framework to guide model design.
AO-PHNov 5, 2019
Cumulo: A Dataset for Learning Cloud ClassesValentina Zantedeschi, Fabrizio Falasca, Alyson Douglas et al.
One of the greatest sources of uncertainty in future climate projections comes from limitations in modelling clouds and in understanding how different cloud types interact with the climate system. A key first step in reducing this uncertainty is to accurately classify cloud types at high spatial and temporal resolution. In this paper, we introduce Cumulo, a benchmark dataset for training and evaluating global cloud classification models. It consists of one year of 1km resolution MODIS hyperspectral imagery merged with pixel-width 'tracks' of CloudSat cloud labels. Bringing these complementary datasets together is a crucial first step, enabling the Machine-Learning community to develop innovative new techniques which could greatly benefit the Climate community. To showcase Cumulo, we provide baseline performance analysis using an invertible flow generative model (IResNet), which further allows us to discover new sub-classes for a given cloud class by exploring the latent space. To compare methods, we introduce a set of evaluation criteria, to identify models that are not only accurate, but also physically-realistic. CUMULO can be download from https://www.dropbox.com/sh/i3s9q2v2jjyk2it/AACxXnXfMF5wuIqLXqH4NJOra?dl=0 .