CV | Dec 26, 2023

Large-scale Long-tailed Disease Diagnosis on Radiology Images

Harvard
arXiv:2312.16151v3 | 30 citations | h-index: 20 | Nat Commun
Originality: Incremental advance
AI Analysis

This work addresses the challenge of developing a generalist radiology diagnosis system to enhance clinical diagnostics. The advance is incremental: it builds on existing transformer-based methods and pairs them with new data.

The paper tackles the problem of large-scale long-tailed disease diagnosis from radiology images by introducing RadDiag, a foundational model that achieves 95.14% AUC on internal evaluation and demonstrates state-of-the-art results on external datasets.

Developing a generalist radiology diagnosis system can greatly enhance clinical diagnostics. In this paper, we introduce RadDiag, a foundational model supporting 2D and 3D inputs across various modalities and anatomies, using a transformer-based fusion module for comprehensive disease diagnosis. Due to patient privacy concerns and the lack of large-scale radiology diagnosis datasets, we utilize high-quality, clinician-reviewed radiological images available online with diagnosis labels. Our dataset, RP3D-DiagDS, contains 40,936 cases with 195,010 scans covering 5,568 disorders (930 unique ICD-10-CM codes). Experimentally, our RadDiag achieves 95.14% AUC on internal evaluation with the knowledge-enhancement strategy. Additionally, RadDiag can be zero-shot applied or fine-tuned to external diagnosis datasets sourced from various hospitals, demonstrating state-of-the-art results. In conclusion, we show that publicly shared medical data on the Internet is a tremendous and valuable resource that can potentially support building a generalist AI for healthcare.
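
The abstract reports a single AUC figure over a long-tailed label set. As a minimal sketch of how such a number can be computed, the Python below averages per-class AUCs with equal weight (macro averaging), so rare disorders count as much as common ones. The averaging scheme is an assumption, since this summary does not state which one the authors use, and the function names and toy data are hypothetical.

```python
from typing import List, Optional, Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> Optional[float]:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Returns None when the class has
    no positive cases or no negative cases, since AUC is undefined then.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_auc(labels: List[List[int]], scores: List[List[float]]) -> float:
    """Unweighted mean of per-class AUCs, skipping classes where AUC is undefined.

    Assumption: macro averaging; the paper may use a different scheme.
    """
    n_classes = len(labels[0])
    per_class = []
    for c in range(n_classes):
        auc = binary_auc([row[c] for row in labels], [row[c] for row in scores])
        if auc is not None:
            per_class.append(auc)
    if not per_class:
        raise ValueError("no class has both positive and negative cases")
    return sum(per_class) / len(per_class)


# Hypothetical toy data: 4 cases, 3 disorder classes (the last one is rare).
y_true = [[1, 0, 0], [0, 1, 0], [1, 0, 1], [0, 1, 0]]
y_score = [[0.9, 0.2, 0.1], [0.3, 0.8, 0.2], [0.6, 0.4, 0.7], [0.4, 0.3, 0.05]]

result = macro_auc(y_true, y_score)
```

The pairwise loop is quadratic in the number of cases per class, which is fine for illustration; a rank-based formulation would be the usual choice at the scale of a dataset with tens of thousands of cases.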

Code Implementations: 1 repo
