Ensemble Machine Learning Methods for Modeling COVID19 Deaths
This work gives public health officials a data-driven tool for forecasting COVID-19 deaths with uncertainty estimates, though its contribution is incremental because it builds on existing modeling approaches.
The authors address county-level prediction of US COVID-19 deaths with a hybrid machine learning and epidemiological model that outputs quantile estimates. The model won a Caltech-run modeling competition against more than 50 teams, and its aggregate forecast is competitive with the best COVID-19 modeling systems on root mean squared error.
Using a hybrid of machine learning and epidemiological approaches, we propose a novel data-driven approach to predicting US COVID-19 deaths at the county level. The model gives a more complete description of the daily death distribution by outputting quantile estimates instead of mean deaths; its objective is to minimize the pinball loss on deaths reported in the New York Times coronavirus county dataset. The resulting quantile estimates accurately forecast deaths at the individual-county level for a variable-length forecast period, and the approach generalizes well across different forecast period lengths. We won the Caltech-run modeling competition out of 50+ teams, and our aggregate is competitive with the best COVID-19 modeling systems (on root mean squared error).
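For readers unfamiliar with the pinball loss named above, the following is a minimal sketch of how it can be computed for a set of quantile forecasts. It is not the authors' implementation: the function names, the example quantile levels (0.1, 0.5, 0.9), the sample numbers, and the choice to average over all counties and quantiles are assumptions made for illustration only.

```python
from typing import Sequence


def pinball_loss(y_true: float, y_pred: float, q: float) -> float:
    """Pinball (quantile) loss for one observation at quantile level q in (0, 1)."""
    diff = y_true - y_pred
    # Under-prediction (diff > 0) costs q per unit;
    # over-prediction (diff < 0) costs (1 - q) per unit.
    return max(q * diff, (q - 1.0) * diff)


def mean_pinball_loss(
    y_true: Sequence[float],
    quantile_preds: Sequence[Sequence[float]],
    quantiles: Sequence[float],
) -> float:
    """Average pinball loss over all observations and quantile levels.

    quantile_preds[i][j] is the predicted quantiles[j]-quantile for observation i.
    Averaging uniformly is an assumption; a competition may weight differently.
    """
    total = 0.0
    count = 0
    for y, preds in zip(y_true, quantile_preds):
        for q, pred in zip(quantiles, preds):
            total += pinball_loss(y, pred, q)
            count += 1
    return total / count


# Hypothetical daily deaths for two counties and their predicted quantiles.
quantiles = [0.1, 0.5, 0.9]
observed = [3.0, 0.0]
predicted = [[1.0, 3.0, 6.0], [0.0, 0.0, 1.0]]
score = mean_pinball_loss(observed, predicted, quantiles)
```

A lower score means the predicted quantiles sit closer to the observed deaths. Because the penalty is asymmetric in the quantile level, a high quantile such as 0.9 is penalized much more for falling below the observed value than for exceeding it, which is what pushes each output toward the intended part of the death distribution.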