Marisa Eisenberg

LG
h-index: 2
4 papers
13 citations
Novelty: 63%
AI Score: 52

4 Papers

18.6 LG Jun 4
Benchmarking Counterfactual Prediction in Epidemic Time Series with Time-Varying Interventions

Wenhao Mu, Facundo Yan, Anik Mumssen et al.

Deep learning has enabled significant advances in time-series causal inference, yet progress remains constrained by the lack of realistic benchmarks with observable counterfactual outcomes. Existing datasets either rely on real-world observations without ground-truth counterfactuals or on simplified simulations that fail to capture complex causal dynamics. To address this gap, we develop a large-scale benchmark for counterfactual prediction in epidemic time series under dynamic interventions. Unlike existing benchmarks, it supports static and time-varying treatments, as well as both single-policy and multi-policy intervention settings, enabling evaluation of causal inference methods across a broad range of scenarios. Leveraging a calibrated agent-based model grounded in real-world demographic, mobility, epidemiological, and policy data, we generate realistic counterfactual trajectories across more than 150 U.S. counties. Using this benchmark, we evaluate widely used and state-of-the-art causal inference methods, revealing substantial performance differences and highlighting the challenges of realistic time-series causal reasoning.

LG Jul 11, 2025
Simulation as Supervision: Mechanistic Pretraining for Scientific Discovery

Carson Dudley, Reiden Magdaleno, Christopher Harding et al.

Scientific modeling faces a tradeoff: mechanistic models provide scientific grounding but struggle with real-world complexity, while machine learning models achieve strong predictive performance but require large labeled datasets and are not interpretable. We introduce Simulation-Grounded Neural Networks (SGNNs), which use mechanistic simulations as training data for neural networks. SGNNs are pretrained on synthetic corpora spanning diverse model structures, parameter regimes, stochasticity, and observational artifacts. Simulation-grounded learning has been applied in multiple domains (e.g., surrogate models in physics, forecasting in epidemiology). We provide a unified framework for simulation-grounded learning and evaluate SGNNs across scientific disciplines and modeling tasks. We found that SGNNs were successful across domains: for prediction tasks, they nearly tripled COVID-19 forecasting skill versus CDC baselines, reduced chemical yield prediction error by one-third, and maintained accuracy in ecological forecasting where task-specific models failed. For inference tasks, SGNNs also accurately classified the source of information spread in simulated social networks and enabled supervised learning for unobservable targets, such as estimating COVID-19 transmissibility more accurately than traditional methods even in early outbreaks. Finally, SGNNs enable back-to-simulation attribution, a form of mechanistic interpretability. Back-to-simulation attribution matches real-world observations to the training simulations the model considers most similar, identifying which mechanistic processes the model believes best explain the observed data. By providing a unified framework for simulation-grounded learning, we establish when and how mechanistic simulations can serve as effective training data for robust, interpretable scientific inference.

AI Aug 17, 2025
Mantis: A Simulation-Grounded Foundation Model for Disease Forecasting

Carson Dudley, Reiden Magdaleno, Christopher Harding et al.

Infectious disease forecasting in novel outbreaks or low-resource settings is hampered by the need for disease-specific data, bespoke training, and expert tuning. We introduce Mantis, a foundation model trained entirely on mechanistic simulations, which enables out-of-the-box forecasting across diseases, regions, and outcomes, even in settings with limited historical data. We evaluated Mantis against 48 forecasting models across six diseases with diverse transmission modes, assessing both point forecast accuracy (mean absolute error) and probabilistic performance (weighted interval score and coverage). Despite using no real-world data during training, Mantis achieved lower mean absolute error than all models in the CDC's COVID-19 Forecast Hub when backtested on early pandemic forecasts. Across all other diseases tested, including respiratory, vector-borne, and waterborne pathogens, Mantis consistently ranked in the top two models across all evaluation metrics. Notably, Mantis generalized to diseases with transmission mechanisms not represented in its training data, demonstrating that it captures fundamental contagion dynamics rather than memorizing disease-specific patterns. These capabilities position Mantis as a practical foundation for disease forecasting: general-purpose, accurate, and deployable where traditional models fail.
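
The three evaluation metrics named in this abstract (mean absolute error, weighted interval score, and coverage) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names and input layout are hypothetical, and the weighted interval score follows the commonly used forecast-hub definition (absolute error of the median weighted by 1/2, each central interval at level 1 - alpha weighted by alpha/2, normalized by K + 1/2), which the abstract does not spell out.

```python
from typing import Dict, Sequence, Tuple


def mean_absolute_error(point_forecasts: Sequence[float],
                        observations: Sequence[float]) -> float:
    """Average absolute gap between point forecasts and observed values."""
    return sum(abs(f - y) for f, y in zip(point_forecasts, observations)) / len(observations)


def interval_score(lower: float, upper: float, y: float, alpha: float) -> float:
    """Score for one central (1 - alpha) interval: width plus miss penalties."""
    width = upper - lower
    below = (2.0 / alpha) * max(lower - y, 0.0)  # penalty if y falls below the interval
    above = (2.0 / alpha) * max(y - upper, 0.0)  # penalty if y falls above the interval
    return width + below + above


def weighted_interval_score(median: float,
                            intervals: Dict[float, Tuple[float, float]],
                            y: float) -> float:
    """WIS for one forecast; `intervals` maps alpha -> (lower, upper). Assumed layout."""
    total = 0.5 * abs(y - median)
    for alpha, (lower, upper) in intervals.items():
        total += (alpha / 2.0) * interval_score(lower, upper, y, alpha)
    return total / (len(intervals) + 0.5)


def empirical_coverage(bounds: Sequence[Tuple[float, float]],
                       observations: Sequence[float]) -> float:
    """Fraction of observations that fall inside their forecast interval."""
    hits = sum(1 for (lo, hi), y in zip(bounds, observations) if lo <= y <= hi)
    return hits / len(observations)
```

Lower values are better for mean absolute error and weighted interval score, while coverage is judged by how close it is to the nominal interval level.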

LG Sep 23, 2025
Learning From Simulators: A Theory of Simulation-Grounded Learning

Carson Dudley, Marisa Eisenberg

Simulation-Grounded Neural Networks (SGNNs) are predictive models trained entirely on synthetic data from mechanistic simulations. They have achieved state-of-the-art performance in domains where real-world labels are limited or unobserved, but lack a formal underpinning. We place SGNNs in a unified statistical framework. Under standard loss functions, they can be interpreted as amortized Bayesian predictors trained under a simulator-induced prior. Empirical risk minimization then yields convergence to the Bayes-optimal predictor under the synthetic distribution. We employ classical results on distribution shift to characterize how performance degrades when the simulator diverges from reality. Beyond these consequences, we develop SGNN-specific results: (i) conditions under which unobserved scientific parameters are learnable via simulation, and (ii) a back-to-simulation attribution method that provides mechanistic explanations of predictions by linking them to the simulations the model deems similar, with guarantees of posterior consistency. We provide numerical experiments to validate theoretical predictions. SGNNs recover latent parameters, remain robust under mismatch, and outperform classical tools: in a model selection task, SGNNs achieve half the error of AIC in distinguishing mechanistic dynamics. These results establish SGNNs as a principled and practical framework for scientific prediction in data-limited regimes.
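
The "amortized Bayesian predictor" reading in this abstract can be made concrete for one standard loss. The sketch below assumes squared-error loss, and the notation (a prior \pi over simulator parameters \theta, a simulator distribution p_sim, inputs x and targets y) is introduced here for illustration rather than taken from the paper.

```latex
% Assumed notation (not from the abstract): \pi is the prior over simulator
% parameters \theta; p_{\mathrm{sim}} is the simulator's sampling distribution.
\begin{align*}
\theta &\sim \pi(\theta), \qquad (x, y) \mid \theta \sim p_{\mathrm{sim}}(x, y \mid \theta), \\
f^\star &= \arg\min_{f} \; \mathbb{E}_{\theta \sim \pi}\,
          \mathbb{E}_{(x, y) \sim p_{\mathrm{sim}}(\cdot \mid \theta)}
          \big[ (f(x) - y)^2 \big], \\
f^\star(x) &= \mathbb{E}[\, y \mid x \,]
           = \int \mathbb{E}[\, y \mid x, \theta \,] \; p(\theta \mid x) \, d\theta,
           \qquad p(\theta \mid x) \propto \pi(\theta)\, p_{\mathrm{sim}}(x \mid \theta).
\end{align*}
```

Under these assumptions the risk minimizer is the posterior-mean prediction, averaged over the parameter posterior implied by the simulator prior. A network trained on simulated pairs approximates this map once and reuses it for any new input, which is the sense in which the predictor is amortized.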