MED-PH · AI · Dec 11, 2017

MURA: Large Dataset for Abnormality Detection in Musculoskeletal Radiographs

arXiv:1712.06957v4 · 88 citations
Originality: Synthesis-oriented
AI Analysis

This paper provides a benchmark dataset for medical imaging research. Its methodological contribution is incremental, since it applies an existing architecture (DenseNet) to a new dataset.

The authors introduced MURA, a large dataset of 40,561 musculoskeletal radiographs labeled for abnormality detection, and trained a DenseNet model that achieved an AUROC of 0.929, with performance comparable to radiologists on finger and wrist studies but lower on other body parts.

We introduce MURA, a large dataset of musculoskeletal radiographs containing 40,561 images from 14,863 studies, where each study is manually labeled by radiologists as either normal or abnormal. To evaluate models robustly and to get an estimate of radiologist performance, we collect additional labels from six board-certified Stanford radiologists on the test set, consisting of 207 musculoskeletal studies. On this test set, the majority vote of a group of three radiologists serves as gold standard. We train a 169-layer DenseNet baseline model to detect and localize abnormalities. Our model achieves an AUROC of 0.929, with an operating point of 0.815 sensitivity and 0.887 specificity. We compare our model and radiologists on the Cohen's kappa statistic, which expresses the agreement of our model and of each radiologist with the gold standard. Model performance is comparable to the best radiologist performance in detecting abnormalities on finger and wrist studies. However, model performance is lower than best radiologist performance in detecting abnormalities on elbow, forearm, hand, humerus, and shoulder studies. We believe that the task is a good challenge for future research. To encourage advances, we have made our dataset freely available at https://stanfordmlgroup.github.io/competitions/mura .
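The evaluation described in the abstract (a majority-vote gold standard, Cohen's kappa against that standard, and a sensitivity/specificity operating point) can be illustrated with a minimal sketch. This is not the authors' code: the function names are hypothetical, the labels are invented toy data (0 = normal, 1 = abnormal), and it assumes binary labels with both classes present in the gold standard.

```python
from collections import Counter


def majority_vote(readers):
    """Per-study majority label across an odd number of readers."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*readers)]


def cohens_kappa(a, b):
    """Cohen's kappa for two binary label sequences of equal length."""
    n = len(a)
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each rater's marginal rate of "abnormal".
    pa, pb = sum(a) / n, sum(b) / n
    p_chance = pa * pb + (1 - pa) * (1 - pb)
    if p_chance == 1.0:
        # Kappa is undefined when both raters use one identical label
        # throughout; returning 1.0 here is a convention of this sketch.
        return 1.0
    return (p_observed - p_chance) / (1 - p_chance)


def sensitivity_specificity(gold, pred):
    """Return (sensitivity, specificity) of pred against gold."""
    tp = sum(g == 1 and p == 1 for g, p in zip(gold, pred))
    tn = sum(g == 0 and p == 0 for g, p in zip(gold, pred))
    positives = sum(gold)
    negatives = len(gold) - positives
    return tp / positives, tn / negatives


# Toy data (invented): three readers labelling eight studies.
readers = [
    [1, 1, 0, 0, 1, 0, 0, 1],
    [1, 0, 0, 0, 1, 0, 1, 1],
    [1, 1, 0, 1, 1, 0, 0, 0],
]
gold = majority_vote(readers)            # [1, 1, 0, 0, 1, 0, 0, 1]
model = [1, 0, 0, 0, 1, 0, 1, 1]         # hypothetical thresholded model output

print(cohens_kappa(gold, model))             # 0.5
print(sensitivity_specificity(gold, model))  # (0.75, 0.75)
```

In this toy case the model agrees with the gold standard on 6 of 8 studies (0.75) while chance agreement is 0.5, giving kappa = (0.75 - 0.5) / (1 - 0.5) = 0.5. The same `cohens_kappa` call applied to a reader's labels in place of `model` gives the per-radiologist agreement the abstract refers to.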

Code Implementations: 11 repos

Data from Papers with Code (CC-BY-SA-4.0)

