Salva Rühling Cachay

h-index29

10papers

311citations

Novelty55%

AI Score45

Ranked #69,771 of 201,326 authors (top 35%)#15,789 in LG (top 37%)

10 Papers

LGFeb 7, 2023Code

Climate Intervention Analysis using AI Model Guided by Statistical Physics Principles

Soo Kyung Kim, Kalai Ramea, Salva Rühling Cachay et al.

The availability of training data remains a significant obstacle for the implementation of machine learning in scientific applications. In particular, estimating how a system might respond to external forcings or perturbations requires specialized labeled data or targeted simulations, which may be computationally intensive to generate at scale. In this study, we propose a novel solution to this challenge by utilizing a principle from statistical physics known as the Fluctuation-Dissipation Theorem (FDT) to discover knowledge using an AI model that can rapidly produce scenarios for different external forcings. By leveraging FDT, we are able to extract information encoded in a large dataset produced by Earth System Models, which includes 8250 years of internal climate fluctuations, to estimate the climate system's response to forcings. Our model, AiBEDO, is capable of capturing the complex, multi-timescale effects of radiation perturbations on global and regional surface climate, allowing for a substantial acceleration of the exploration of the impacts of spatially-heterogenous climate forcers. To demonstrate the utility of AiBEDO, we use the example of a climate intervention technique called Marine Cloud Brightening, with the ultimate goal of optimizing the spatial pattern of cloud brightening to achieve regional climate targets and prevent known climate tipping points. While we showcase the effectiveness of our approach in the context of climate science, it is generally applicable to other scientific disciplines that are limited by the extensive computational demands of domain simulation models. Source code of AiBEDO framework is made available at https://github.com/kramea/kdd_aibedo. A sample dataset is made available at https://doi.org/10.5281/zenodo.7597027. Additional data available upon request.

LGJun 3, 2023

DYffusion: A Dynamics-informed Diffusion Model for Spatiotemporal Forecasting

Salva Rühling Cachay, Bo Zhao, Hailey Joren et al.

While diffusion models can successfully generate data and make predictions, they are predominantly designed for static images. We propose an approach for efficiently training diffusion models for probabilistic spatiotemporal forecasting, where generating stable and accurate rollout forecasts remains challenging, Our method, DYffusion, leverages the temporal dynamics in the data, directly coupling it with the diffusion steps in the model. We train a stochastic, time-conditioned interpolator and a forecaster network that mimic the forward and reverse processes of standard diffusion models, respectively. DYffusion naturally facilitates multi-step and long-range forecasting, allowing for highly flexible, continuous-time sampling trajectories and the ability to trade-off performance with accelerated sampling at inference time. In addition, the dynamics-informed diffusion process in DYffusion imposes a strong inductive bias and significantly improves computational efficiency compared to traditional Gaussian noise-based diffusion models. Our approach performs competitively on probabilistic forecasting of complex dynamics in sea surface temperatures, Navier-Stokes flows, and spring mesh systems.

LGJun 24, 2025Code

Elucidated Rolling Diffusion Models for Probabilistic Weather Forecasting

Salva Rühling Cachay, Miika Aittala, Karsten Kreis et al.

Diffusion models are a powerful tool for probabilistic forecasting, yet most applications in high-dimensional chaotic systems predict future snapshots one-by-one. This common approach struggles to model complex temporal dependencies and fails to explicitly account for the progressive growth of uncertainty inherent to such systems. While rolling diffusion frameworks, which apply increasing noise to forecasts at longer lead times, have been proposed to address this, their integration with state-of-the-art, high-fidelity diffusion techniques remains a significant challenge. We tackle this problem by introducing Elucidated Rolling Diffusion Models (ERDM), the first framework to successfully unify a rolling forecast structure with the principled, performant design of Elucidated Diffusion Models (EDM). To do this, we adapt the core EDM components-its noise schedule, network preconditioning, and Heun sampler-to the rolling forecast setting. The success of this integration is driven by three key contributions: (i) a novel loss weighting scheme that focuses model capacity on the mid-range forecast horizons where determinism gives way to stochasticity; (ii) an efficient initialization strategy using a pre-trained EDM for the initial window; and (iii) a bespoke hybrid sequence architecture for robust spatiotemporal feature extraction under progressive denoising. On 2D Navier-Stokes simulations and ERA5 global weather forecasting at 1.5^\circ resolution, ERDM consistently outperforms key diffusion-based baselines, including conditional autoregressive EDM. ERDM offers a flexible and powerful general framework for tackling diffusion-based sequence generation problems where modeling escalating uncertainty is paramount. Code is available at: https://github.com/salvaRC/erdm

LGNov 29, 2021Code

ClimART: A Benchmark Dataset for Emulating Atmospheric Radiative Transfer in Weather and Climate Models

Salva Rühling Cachay, Venkatesh Ramesh, Jason N. S. Cole et al.

Numerical simulations of Earth's weather and climate require substantial amounts of computation. This has led to a growing interest in replacing subroutines that explicitly compute physical processes with approximate machine learning (ML) methods that are fast at inference time. Within weather and climate models, atmospheric radiative transfer (RT) calculations are especially expensive. This has made them a popular target for neural network-based emulators. However, prior work is hard to compare due to the lack of a comprehensive dataset and standardized best practices for ML benchmarking. To fill this gap, we build a large dataset, ClimART, with more than \emph{10 million samples from present, pre-industrial, and future climate conditions}, based on the Canadian Earth System Model. ClimART poses several methodological challenges for the ML community, such as multiple out-of-distribution test sets, underlying domain physics, and a trade-off between accuracy and inference speed. We also present several novel baselines that indicate shortcomings of datasets and network architectures used in prior work. Download instructions, baselines, and code are available at: https://github.com/RolnickLab/climart

AIOct 5, 2025

Zephyrus: An Agentic Framework for Weather Science

Sumanth Varambally, Marshall Fisher, Jas Thakker et al.

Foundation models for weather science are pre-trained on vast amounts of structured numerical data and outperform traditional weather forecasting systems. However, these models lack language-based reasoning capabilities, limiting their utility in interactive scientific workflows. Large language models (LLMs) excel at understanding and generating text but cannot reason about high-dimensional meteorological datasets. We bridge this gap by building a novel agentic framework for weather science. Our framework includes a Python code-based environment for agents (ZephyrusWorld) to interact with weather data, featuring tools like an interface to WeatherBench 2 dataset, geoquerying for geographical masks from natural language, weather forecasting, and climate simulation capabilities. We design Zephyrus, a multi-turn LLM-based weather agent that iteratively analyzes weather datasets, observes results, and refines its approach through conversational feedback loops. We accompany the agent with a new benchmark, ZephyrusBench, with a scalable data generation pipeline that constructs diverse question-answer pairs across weather-related tasks, from basic lookups to advanced forecasting, extreme event detection, and counterfactual reasoning. Experiments on this benchmark demonstrate the strong performance of Zephyrus agents over text-only baselines, outperforming them by up to 35 percentage points in correctness. However, on harder tasks, Zephyrus performs similarly to text-only baselines, highlighting the challenging nature of our benchmark and suggesting promising directions for future work.

LGJun 21, 2024

Probabilistic Emulation of a Global Climate Model with Spherical DYffusion

Salva Rühling Cachay, Brian Henn, Oliver Watt-Meyer et al.

Data-driven deep learning models are transforming global weather forecasting. It is an open question if this success can extend to climate modeling, where the complexity of the data and long inference rollouts pose significant challenges. Here, we present the first conditional generative model that produces accurate and physically consistent global climate ensemble simulations by emulating a coarse version of the United States' primary operational global forecast model, FV3GFS. Our model integrates the dynamics-informed diffusion framework (DYffusion) with the Spherical Fourier Neural Operator (SFNO) architecture, enabling stable 100-year simulations at 6-hourly timesteps while maintaining low computational overhead compared to single-step deterministic baselines. The model achieves near gold-standard performance for climate model emulation, outperforming existing approaches and demonstrating promising ensemble skill. This work represents a significant advance towards efficient, data-driven climate simulations that can enhance our understanding of the climate system and inform adaptation strategies.

LGJul 5, 2021

End-to-End Weak Supervision

Salva Rühling Cachay, Benedikt Boecking, Artur Dubrawski

Aggregating multiple sources of weak supervision (WS) can ease the data-labeling bottleneck prevalent in many machine learning applications, by replacing the tedious manual collection of ground truth labels. Current state of the art approaches that do not use any labeled training data, however, require two separate modeling steps: Learning a probabilistic latent variable model based on the WS sources -- making assumptions that rarely hold in practice -- followed by downstream model training. Importantly, the first step of modeling does not consider the performance of the downstream model. To address these caveats we propose an end-to-end approach for directly learning the downstream model by maximizing its agreement with probabilistic labels generated by reparameterizing previous probabilistic posteriors with a neural network. Our results show improved performance over prior work in terms of end model performance on downstream test sets, as well as in terms of improved robustness to dependencies among weak supervision sources.

LGJun 18, 2021

Dependency Structure Misspecification in Multi-Source Weak Supervision Models

Salva Rühling Cachay, Benedikt Boecking, Artur Dubrawski

Data programming (DP) has proven to be an attractive alternative to costly hand-labeling of data. In DP, users encode domain knowledge into \emph{labeling functions} (LF), heuristics that label a subset of the data noisily and may have complex dependencies. A label model is then fit to the LFs to produce an estimate of the unknown class label. The effects of label model misspecification on test set performance of a downstream classifier are understudied. This presents a serious awareness gap to practitioners, in particular since the dependency structure among LFs is frequently ignored in field applications of DP. We analyse modeling errors due to structure over-specification. We derive novel theoretical bounds on the modeling error and empirically show that this error can be substantial, even when modeling a seemingly sensible structure.

LGApr 11, 2021

The World as a Graph: Improving El Niño Forecasts with Graph Neural Networks

Salva Rühling Cachay, Emma Erickson, Arthur Fender C. Bucker et al.

Deep learning-based models have recently outperformed state-of-the-art seasonal forecasting models, such as for predicting El Niño-Southern Oscillation (ENSO). However, current deep learning models are based on convolutional neural networks which are difficult to interpret and can fail to model large-scale atmospheric patterns. In comparison, graph neural networks (GNNs) are capable of modeling large-scale spatial dependencies and are more interpretable due to the explicit modeling of information flow through edge connections. We propose the first application of graph neural networks to seasonal forecasting. We design a novel graph connectivity learning module that enables our GNN model to learn large-scale spatial interactions jointly with the actual ENSO forecasting task. Our model, \graphino, outperforms state-of-the-art deep learning-based models for forecasts up to six months ahead. Additionally, we show that our model is more interpretable as it learns sensible connectivity structures that correlate with the ENSO anomaly pattern.

LGDec 2, 2020

Graph Neural Networks for Improved El Niño Forecasting

Salva Rühling Cachay, Emma Erickson, Arthur Fender C. Bucker et al.

Deep learning-based models have recently outperformed state-of-the-art seasonal forecasting models, such as for predicting El Niño-Southern Oscillation (ENSO). However, current deep learning models are based on convolutional neural networks which are difficult to interpret and can fail to model large-scale atmospheric patterns called teleconnections. Hence, we propose the application of spatiotemporal Graph Neural Networks (GNN) to forecast ENSO at long lead times, finer granularity and improved predictive skill than current state-of-the-art methods. The explicit modeling of information flow via edges may also allow for more interpretable forecasts. Preliminary results are promising and outperform state-of-the art systems for projections 1 and 3 months ahead.