Jatan Buch

LG
h-index 7
4 papers
13 citations
Novelty 49%
AI Score 29

4 Papers

LG | May 20, 2025 | Code
Deep Koopman operator framework for causal discovery in nonlinear dynamical systems

Juan Nathaniel, Carla Roesch, Jatan Buch et al.

We use a deep Koopman operator-theoretic formalism to develop a novel causal discovery algorithm, Kausal. Causal discovery aims to identify cause-effect mechanisms for better scientific understanding, explainable decision-making, and more accurate modeling. Standard statistical frameworks, such as Granger causality, lack the ability to quantify causal relationships in nonlinear dynamics due to the presence of complex feedback mechanisms, timescale mixing, and nonstationarity. This presents a challenge in studying many real-world systems, such as the Earth's climate. Meanwhile, Koopman operator methods have emerged as a promising tool for approximating nonlinear dynamics in a linear space of observables. In Kausal, we propose to leverage this powerful idea for causal analysis where optimal observables are inferred using deep learning. Causal estimates are then evaluated in a reproducing kernel Hilbert space, and defined as the distance between the marginal dynamics of the effect and the joint dynamics of the cause-effect observables. Our numerical experiments demonstrate Kausal's superior ability in discovering and characterizing causal signals compared to existing approaches that rely on prescribed observables. Lastly, we extend our analysis to observations of the El Niño-Southern Oscillation, highlighting our algorithm's applicability to real-world phenomena. Our code is available at https://github.com/juannat7/kausal.

ML | Mar 11, 2024
Efficient first-order algorithms for large-scale, non-smooth maximum entropy models with application to wildfire science

Gabriel P. Langlois, Jatan Buch, Jérôme Darbon

Maximum entropy (Maxent) models are a class of statistical models that use the maximum entropy principle to estimate probability distributions from data. Due to the size of modern data sets, Maxent models need efficient optimization algorithms to scale well for big data applications. State-of-the-art algorithms for Maxent models, however, were not originally designed to handle big data sets; these algorithms rely on technical devices that may yield unreliable numerical results, scale poorly, or require smoothness assumptions that many practical Maxent models lack. In this paper, we present novel optimization algorithms that overcome the shortcomings of state-of-the-art algorithms for training large-scale, non-smooth Maxent models. Our proposed first-order algorithms leverage the Kullback-Leibler divergence to train large-scale and non-smooth Maxent models efficiently. For Maxent models with a discrete probability distribution of $n$ elements built from samples, each containing $m$ features, the estimation of stepsize parameters and the iterations in our algorithms scale on the order of $O(mn)$ operations and can be trivially parallelized. Moreover, the strong $\ell_{1}$ convexity of the Kullback-Leibler divergence allows for larger stepsize parameters, thereby speeding up the convergence rate of our algorithms. To illustrate the efficiency of our novel algorithms, we consider the problem of estimating probabilities of fire occurrences as a function of ecological features in the Western US MTBS-Interagency wildfire data set. Our numerical results show that our algorithms outperform the state of the art by one order of magnitude and yield results that agree with physical models of wildfire occurrence and previous statistical analyses of wildfire drivers.

LG | May 22, 2025
CausalDynamics: A large-scale benchmark for structural discovery of dynamical causal models

Benjamin Herdeanu, Juan Nathaniel, Carla Roesch et al.

Causal discovery for dynamical systems poses a major challenge in fields where active interventions are infeasible. Most methods used to investigate these systems and their associated benchmarks are tailored to deterministic, low-dimensional and weakly nonlinear time-series data. To address these limitations, we present CausalDynamics, a large-scale benchmark and extensible data generation framework to advance the structural discovery of dynamical causal models. Our benchmark consists of true causal graphs derived from thousands of both linearly and nonlinearly coupled ordinary and stochastic differential equations as well as two idealized climate models. We perform a comprehensive evaluation of state-of-the-art causal discovery algorithms for graph reconstruction on systems with noisy, confounded, and lagged dynamics. CausalDynamics consists of a plug-and-play, build-your-own coupling workflow that enables the construction of a hierarchy of physical systems. We anticipate that our framework will facilitate the development of robust causal discovery algorithms that are broadly applicable across domains while addressing their unique challenges. We provide a user-friendly implementation and documentation at https://kausable.github.io/CausalDynamics.

AO-PH | Dec 7, 2023
Simulating the Air Quality Impact of Prescribed Fires Using Graph Neural Network-Based PM$_{2.5}$ Forecasts

Kyleen Liao, Jatan Buch, Kara Lamb et al.

The increasing size and severity of wildfires across the western United States have generated dangerous levels of PM$_{2.5}$ concentrations in recent years. In a changing climate, expanding the use of prescribed fires is widely considered to be the most robust fire mitigation strategy. However, reliably forecasting the potential air quality impact of prescribed fires at hourly to daily time scales, which is critical for planning their location and timing, remains a challenging problem. In this paper, we introduce a spatial-temporal graph neural network (GNN) based forecasting model for hourly PM$_{2.5}$ predictions across California. Using a two-step approach, we leverage our forecasting model to estimate the PM$_{2.5}$ contribution of wildfires. Integrating the GNN-based PM$_{2.5}$ forecasting model with prescribed fire simulations, we propose a novel framework to forecast the PM$_{2.5}$ pollution of prescribed fires. This framework helps determine March as the optimal month for implementing prescribed fires in California and quantifies the potential air quality trade-offs involved in conducting more prescribed fires outside the fire season.