Tuberculosis Screening from Cough Audio: Baseline Models, Clinical Variables, and Uncertainty Quantification
This work addresses the problem of inconsistent evaluation in TB audio screening for researchers and clinicians, but it is incremental as it focuses on standardization rather than new breakthroughs.
The authors tackled the lack of comparability in tuberculosis screening from cough audio by establishing a standardized, reproducible baseline framework using machine learning, which includes multimodal fusion and uncertainty quantification, though no specific performance numbers are provided.
In this paper, we propose a standardized framework for automatic tuberculosis (TB) detection from cough audio and routinely collected clinical data using machine learning. While TB screening from audio has attracted growing interest, progress is difficult to measure because existing studies vary substantially in datasets, cohort definitions, feature representations, model families, validation protocols, and reported metrics. Consequently, reported gains are often not directly comparable, and it remains unclear whether improvements stem from modeling advances or from differences in data and evaluation. We address this gap by establishing a strong, well-documented baseline for TB prediction using cough recordings and accompanying clinical metadata from a recently compiled dataset from several countries. Our pipeline is reproducible end-to-end, covering feature extraction, multimodal fusion, cougher-independent evaluation, and uncertainty quantification, and it reports a consistent suite of clinically relevant metrics to enable fair comparison. We further quantify performance for cough audio-only and fused (audio + clinical metadata) models, and release the full experimental protocol to facilitate benchmarking. This baseline is intended to serve as a common reference point and to reduce methodological variance that currently holds back progress in the field.