EpiEvolve: Self-Evolving Agents for Streaming Pandemic Forecasting under Regime Shifts
Operational epidemic forecasters are trained as static models but deployed on a stream in which disease regimes shift. This work addresses that mismatch with a practical self-evolving agent that improves accuracy and shortens adaptation time after a shift.
EpiEvolve adapts a static LLM forecaster to streaming pandemic forecasting under regime shifts, achieving 0.629 average accuracy vs. 0.561 for the static backbone and reducing recovery lag from 5 to 2 weeks.
Epidemic LLM forecasters are usually trained and evaluated as static supervised models, whereas operational pandemic forecasting is a streaming process in which labels arrive after predictions and disease regimes shift over time. We study this mismatch in weekly COVID-19 hospitalization trend forecasting across five variant regimes. We introduce EpiEvolve, a self-evolving agent that wraps an LLM forecaster trained on the warm-start period and keeps its weights fixed during streaming. EpiEvolve adapts by storing forecast outcomes in a hierarchical episodic memory, reflecting on delayed labels, retrieving cases relevant to the current regime, and distilling recurring errors into strategic rules. The resulting context lets the forecaster reuse its own past predictions and outcomes in later weeks while following a chronological protocol that prevents future leakage. On the streaming dataset, EpiEvolve reaches $0.629$ average accuracy, compared with $0.561$ for the static backbone and $0.325$ for the external CDC ensemble, and reduces recovery lag after regime shifts from $5$ to $2$ weeks. Ablations show that reflection, strategic memory, and regime-aware retrieval each contribute to the gains.
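The chronological protocol described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Episode`, `EpisodicMemory`, `run_stream`, and `last_label_forecaster`, the one-week label delay, the trend labels `"up"`, `"down"`, and `"stable"`, and the same-regime-first ranking are all assumptions made for illustration. The LLM forecaster is replaced by a trivial stand-in, and the reflection and strategic-rule steps are omitted. The sketch shows only the ordering constraint: each week, delayed labels are attached first, retrieval sees resolved episodes only, and the new forecast is stored without a label.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence


@dataclass
class Episode:
    """One stored forecast; `label` stays None until the delayed outcome arrives."""
    week: int
    regime: str
    prediction: str
    label: Optional[str] = None

    @property
    def resolved(self) -> bool:
        return self.label is not None


@dataclass
class EpisodicMemory:
    """Flat stand-in for the hierarchical memory (hypothetical structure)."""
    episodes: List[Episode] = field(default_factory=list)

    def store(self, episode: Episode) -> None:
        self.episodes.append(episode)

    def reveal_labels(self, current_week: int, labels: Sequence[str], delay: int) -> None:
        # Attach only the labels that are observable by `current_week`.
        for ep in self.episodes:
            if not ep.resolved and ep.week + delay <= current_week:
                ep.label = labels[ep.week]

    def retrieve(self, regime: str, k: int) -> List[Episode]:
        # Assumed ranking: same-regime episodes first, then most recent first.
        resolved = [ep for ep in self.episodes if ep.resolved]
        ranked = sorted(resolved, key=lambda ep: (ep.regime == regime, ep.week), reverse=True)
        return ranked[:k]


Forecaster = Callable[[int, str, List[Episode]], str]


def run_stream(forecaster: Forecaster, regimes: Sequence[str], labels: Sequence[str],
               delay: int = 1, k: int = 3) -> List[str]:
    memory = EpisodicMemory()
    predictions: List[str] = []
    for week, regime in enumerate(regimes):
        memory.reveal_labels(week, labels, delay)   # 1. delayed labels arrive
        context = memory.retrieve(regime, k)        # 2. retrieve resolved cases only
        # Guard against future leakage: every retrieved label is already observable.
        assert all(ep.week + delay <= week for ep in context)
        prediction = forecaster(week, regime, context)  # 3. frozen forecaster + context
        memory.store(Episode(week, regime, prediction))  # 4. store the unlabeled forecast
        predictions.append(prediction)
    return predictions


def last_label_forecaster(week: int, regime: str, context: List[Episode]) -> str:
    # Toy stand-in for the LLM: repeat the label of the top-ranked retrieved case.
    return context[0].label if context else "stable"


# Toy stream with one regime shift (A to B) at week 3.
regimes = ["A", "A", "A", "B", "B"]
labels = ["up", "up", "down", "down", "up"]
predictions = run_stream(last_label_forecaster, regimes, labels, delay=1)
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
```

In this toy run the forecaster's weights never change. Only the retrieved context changes from week to week, which is the sense in which adaptation happens through memory rather than through retraining.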