Cross-Country Learning for National Infectious Disease Forecasting Using European Data
This work addresses the challenge of accurate public health forecasting for countries with sparse historical data. The contribution is incremental, since it extends existing cross-country learning methods to infectious diseases.
The authors tackle the problem of limited national data for infectious disease forecasting by training a single model on time series from multiple European countries. Applied to COVID-19 case forecasting in Cyprus, the approach yields consistent improvements over models trained only on national data.
Accurate forecasting of infectious disease incidence is critical for public health planning and timely intervention. While most data-driven forecasting approaches rely primarily on historical data from a single country, such data are often limited in length and variability, restricting the performance of machine learning (ML) models. In this work, we investigate a cross-country learning approach for infectious disease forecasting, in which a single model is trained on time series data from multiple countries and evaluated on a country of interest. This setting enables the model to exploit shared epidemic dynamics across countries and to benefit from an enlarged training set. We examine this approach through a case study on COVID-19 case forecasting in Cyprus, using surveillance data from European countries. We evaluate multiple ML models and analyse the impact of the lookback window length and cross-country 'data augmentation' on multi-step forecasting performance. Our results show that incorporating data from other countries can lead to consistent improvements over models trained solely on national data. Although the empirical focus is on Cyprus and COVID-19, the proposed framework and findings are applicable to infectious disease forecasting more broadly, particularly in settings with limited national historical data.
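The cross-country setting described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the series are synthetic placeholders, the country names (`target`, `neighbour_a`, `neighbour_b`) and helper functions are hypothetical, and a plain least-squares linear model stands in for the ML models the paper evaluates. The lookback of 14 and horizon of 7 are arbitrary illustrative values, and any per-country scaling is omitted. The sketch slices each series into lookback/horizon windows, pools the windows from all countries into one training set, and compares against a model fitted on the target country alone.

```python
import numpy as np

def make_windows(series, lookback, horizon):
    """Slice one series into (lookback inputs, horizon targets) pairs."""
    series = np.asarray(series, dtype=float)
    n = len(series) - lookback - horizon + 1
    if n <= 0:
        return np.empty((0, lookback)), np.empty((0, horizon))
    X = np.stack([series[i:i + lookback] for i in range(n)])
    Y = np.stack([series[i + lookback:i + lookback + horizon] for i in range(n)])
    return X, Y

def pool_countries(series_by_country, lookback, horizon):
    """Concatenate the windows of every country into one training set."""
    parts = [make_windows(s, lookback, horizon) for s in series_by_country.values()]
    return (np.concatenate([p[0] for p in parts]),
            np.concatenate([p[1] for p in parts]))

def fit_linear(X, Y):
    """Multi-output least squares with an intercept column (stand-in model)."""
    A = np.hstack([X, np.ones((len(X), 1))])
    W, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return W

def predict(W, X):
    return np.hstack([X, np.ones((len(X), 1))]) @ W

# Synthetic placeholder series; NOT real surveillance data.
rng = np.random.default_rng(0)
t = np.arange(120)
base = 100 + 50 * np.sin(t / 10)
data = {
    "target": base + rng.normal(0, 5, t.size),
    "neighbour_a": 1.2 * base + rng.normal(0, 5, t.size),
    "neighbour_b": 0.8 * np.roll(base, 5) + rng.normal(0, 5, t.size),
}

lookback, horizon, split = 14, 7, 100  # illustrative values only

# Train on the first `split` points of every country.
train = {name: s[:split] for name, s in data.items()}
X_pool, Y_pool = pool_countries(train, lookback, horizon)
X_nat, Y_nat = make_windows(train["target"], lookback, horizon)

W_pool = fit_linear(X_pool, Y_pool)   # cross-country model
W_nat = fit_linear(X_nat, Y_nat)      # national-only baseline

# Evaluate on the target country only; test targets start at index `split`.
X_test, Y_test = make_windows(data["target"][split - lookback:], lookback, horizon)
mae_pool = np.mean(np.abs(predict(W_pool, X_test) - Y_test))
mae_nat = np.mean(np.abs(predict(W_nat, X_test) - Y_test))
print(f"MAE pooled: {mae_pool:.2f}  MAE national-only: {mae_nat:.2f}")
```

Which model scores better here depends entirely on the synthetic data and says nothing about the paper's results. The point of the sketch is the data layout: pooling triples the number of training windows while evaluation remains restricted to the country of interest.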