SD | LG | AS | AP | Dec 15, 2022

Statistical Design and Analysis for Robust Machine Learning: A Case Study from COVID-19

arXiv:2212.08571v2 | 5 citations | h-index: 27
AI Analysis

This work addresses limitations in data collection and performance assessment for COVID-19 prediction models, offering practical guidelines for researchers in public health and machine learning.

The paper rigorously assesses state-of-the-art machine learning techniques for predicting COVID-19 infection status from vocal audio signals, using a dataset from the UK Health Security Agency, and provides guidelines for performance testing and extension to public health datasets.

Since early in the coronavirus disease 2019 (COVID-19) pandemic, there has been interest in using artificial intelligence methods to predict COVID-19 infection status from vocal audio signals such as cough recordings. However, existing studies have limitations in both data collection and the assessment of the proposed predictive models' performance. This paper rigorously assesses state-of-the-art machine learning techniques used to predict COVID-19 infection status from vocal audio signals, using a dataset collected by the UK Health Security Agency. This dataset includes acoustic recordings and extensive study participant metadata. We provide guidelines on testing the performance of methods that classify COVID-19 infection status from acoustic features, and we discuss how these guidelines can be extended more generally to the development and assessment of predictive methods based on public health datasets.

