Zero-Shot Pediatric Tuberculosis Detection in Chest X-Rays using Self-Supervised Learning
This work addresses pediatric tuberculosis screening, which is critical for global health but hampered by data scarcity and subjective interpretation. It enables zero-shot pediatric detection using adult data, though the contribution is incremental because it builds on existing self-supervised and ViT methods.
The paper tackles tuberculosis detection in chest X-rays, particularly in pediatric cases where data are scarce, by proposing a self-supervised learning approach based on Vision Transformers. It reports performance gains of up to 13.4% in AUC/AUPR and top AUCs of 0.959 for adults and 0.697 for zero-shot pediatric detection.
Tuberculosis (TB) remains a significant global health challenge, with pediatric cases posing a major concern. The World Health Organization (WHO) advocates the use of chest X-rays (CXRs) for TB screening. However, visual interpretation by radiologists can be subjective, time-consuming, and prone to error, especially in pediatric TB. Artificial intelligence (AI)-driven computer-aided detection (CAD) tools, especially those based on deep learning, show promise in enhancing lung disease detection, but they are limited by data scarcity and a lack of generalizability. In this context, we propose a novel self-supervised paradigm leveraging Vision Transformers (ViT) for improved TB detection in CXRs, enabling zero-shot pediatric TB detection. Self-supervised pre-training improves TB detection performance over fully-supervised (i.e., non-pre-trained) ViT models, with top AUC/AUPR gains of $\sim$12.7% and $\sim$13.4% in adults and children, respectively. The best models achieve 0.959 AUC and 0.962 AUPR in adult TB detection, and 0.697 AUC and 0.607 AUPR in zero-shot pediatric TB detection. This work therefore demonstrates that self-supervised learning on adult CXRs extends effectively to challenging downstream tasks such as pediatric TB detection, where data are scarce.
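The two metrics reported above, AUC and AUPR, can be illustrated with a minimal, dependency-free sketch. It assumes a zero-shot setting in which a model trained only on adult CXRs outputs one TB score per pediatric image, with no pediatric fine-tuning. The function names, labels, and scores below are hypothetical and chosen purely for illustration; they are not the authors' code or data, and the step-wise average-precision formula is one common way to summarize the precision-recall curve, not necessarily the one used in the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve.

    Sums (recall increase) * (precision) at each distinct score threshold,
    walking from the highest score to the lowest.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    ap, tp, i = 0.0, 0, 0
    while i < len(ranked):
        # Group all examples sharing the same score into one threshold.
        j, tp_group = i, 0
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp_group += ranked[j][1]
            j += 1
        tp += tp_group
        ap += (tp_group / n_pos) * (tp / j)  # j = number of examples seen so far
        i = j
    return ap


# Hypothetical pediatric test set: 1 = TB, 0 = no TB (illustrative values only).
pediatric_labels = [1, 0, 1, 1, 0, 0]
# Hypothetical scores from a model trained on adult CXRs only.
pediatric_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

auc = roc_auc(pediatric_labels, pediatric_scores)
aupr = average_precision(pediatric_labels, pediatric_scores)
print(f"zero-shot AUC={auc:.3f}, AUPR={aupr:.3f}")
```

AUC measures how often a TB case is ranked above a non-TB case, while AUPR is more sensitive to class imbalance, which is one reason the two are commonly reported together for screening tasks.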