Supervised learning improves disease outbreak detection
This work addresses the need for more accurate outbreak detection in public health surveillance systems, representing an incremental improvement over existing statistical tools.
The authors tackled early detection of infectious disease outbreaks by developing the first supervised learning approach based on hidden Markov models. Compared with a state-of-the-art method used in multiple European countries, it reduced the false positive rate by up to 50% at the same sensitivity.
The early detection of infectious disease outbreaks is crucial for protecting population health. To this end, public health surveillance systems have been established to systematically collect and analyse infectious disease data. A variety of statistical tools are available that use these data to detect potential outbreaks as aberrations from an expected endemic level. Here, we develop the first supervised learning approach based on hidden Markov models for disease outbreak detection, which leverages data that are routinely collected within a public health surveillance system. We evaluate our model using real Salmonella and Campylobacter data, as well as simulations. In comparison to a state-of-the-art approach, which is applied in multiple European countries including Germany, our proposed model reduces the false positive rate by up to 50% while retaining the same sensitivity. We see our supervised learning approach as a significant step in the further development of machine learning applications for disease outbreak detection, which will be instrumental in improving public health surveillance systems.
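To make the idea of a supervised hidden Markov model concrete, the following is a minimal sketch and not the authors' model, whose details the abstract does not give. It assumes two hidden states (endemic and outbreak), Poisson-distributed weekly case counts, and a label for each week of the kind a surveillance system might record. The function names, the 0.5 alarm threshold, and the toy counts are all hypothetical. Because the states are labelled, the transition matrix and emission rates can be estimated by simple counting and averaging; forward filtering then gives the outbreak probability for each week, from which sensitivity and false positive rate follow.

```python
import numpy as np
from math import lgamma


def fit_supervised_hmm(counts, labels):
    """Estimate a two-state Poisson HMM from labelled weekly counts.

    labels: 0 = endemic, 1 = outbreak (assumed to be available per week).
    """
    counts = np.asarray(counts, dtype=float)
    labels = np.asarray(labels, dtype=int)
    # Count observed state transitions, with add-one smoothing.
    trans = np.ones((2, 2))
    for a, b in zip(labels[:-1], labels[1:]):
        trans[a, b] += 1
    trans /= trans.sum(axis=1, keepdims=True)
    # Poisson rate per state: mean count over the weeks with that label.
    rates = np.array([counts[labels == s].mean() for s in (0, 1)])
    # Initial distribution: empirical label frequencies.
    init = np.bincount(labels, minlength=2) / len(labels)
    return init, trans, rates


def poisson_logpmf(k, lam):
    """Log-probability of count k under a Poisson distribution with rate lam."""
    return k * np.log(lam) - lam - lgamma(k + 1)


def filter_outbreak_prob(counts, init, trans, rates):
    """Forward filtering: P(state_t = outbreak | counts up to week t)."""
    belief = np.asarray(init, dtype=float).copy()
    probs = []
    for t, k in enumerate(counts):
        if t > 0:
            belief = belief @ trans  # predict one week ahead
        lik = np.exp([poisson_logpmf(k, lam) for lam in rates])
        belief = belief * lik        # update with the observed count
        belief /= belief.sum()
        probs.append(belief[1])
    return np.array(probs)


def sensitivity_and_fpr(alarms, labels):
    """Sensitivity and false positive rate of binary alarms against labels."""
    alarms = np.asarray(alarms, dtype=bool)
    truth = np.asarray(labels, dtype=bool)
    sensitivity = (alarms & truth).sum() / truth.sum()
    fpr = (alarms & ~truth).sum() / (~truth).sum()
    return sensitivity, fpr


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    counts = [3, 4, 2, 5, 20, 25, 22, 4, 3, 2]
    labels = [0, 0, 0, 0, 1, 1, 1, 0, 0, 0]
    init, trans, rates = fit_supervised_hmm(counts, labels)
    probs = filter_outbreak_prob(counts, init, trans, rates)
    print(sensitivity_and_fpr(probs > 0.5, labels))
```

The sketch evaluates on the same weeks it was fitted to, which is only acceptable for illustration; a real comparison would use held-out data, and a realistic model would also need to account for seasonality and trend in the endemic level.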