CV, LG · Dec 2, 2021

CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer

Moein Sorkhei, Yue Liu, Hossein Azizpour, Edward Azavedo, Karin Dembrower, Dimitra Ntoula, Athanasios Zouzos, Fredrik Strand, Kevin Smith

arXiv:2112.01330v28.718 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of detecting aggressive breast cancers earlier in screening participants. It is an incremental advance: it builds on existing masking research by contributing a new dataset.

The authors tackled the problem of missed breast cancer detection in mammograms due to masking by introducing CSAW-M, a large public dataset with direct annotations from specialists, and showed that deep learning models trained on it predict interval and large invasive cancers more effectively than breast density measures.

Interval and large invasive breast cancers, which are associated with worse prognosis than other cancers, are usually detected at a late stage due to false negative assessments of screening mammograms. The missed screening-time detection is commonly caused by the tumor being obscured by its surrounding breast tissues, a phenomenon called masking. To study and benchmark mammographic masking of cancer, in this work we introduce CSAW-M, the largest public mammographic dataset, collected from over 10,000 individuals and annotated with potential masking. In contrast to the previous approaches which measure breast image density as a proxy, our dataset directly provides annotations of masking potential assessments from five specialists. We also trained deep learning models on CSAW-M to estimate the masking level and showed that the estimated masking is significantly more predictive of screening participants diagnosed with interval and large invasive cancers -- without being explicitly trained for these tasks -- than its breast density counterparts.
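The claim that one score is "more predictive" than another can be made concrete with the area under the ROC curve (AUC). The sketch below is only an illustration of that kind of comparison, not the authors' evaluation code: the function name `roc_auc`, the variable names, and all numbers are made-up assumptions, and the abstract does not state which metric or significance test was used.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case.
    Ties count as half a win (Mann-Whitney formulation)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, NOT taken from CSAW-M:
# label 1 = participant later diagnosed with an interval cancer (assumed).
labels = [0, 0, 0, 0, 1, 1, 1, 0]
masking_score = [1, 2, 2, 3, 6, 7, 5, 4]   # assumed model-estimated masking level
density_score = [2, 1, 4, 3, 3, 4, 2, 1]   # assumed breast density measure

auc_masking = roc_auc(masking_score, labels)
auc_density = roc_auc(density_score, labels)
print(f"AUC (masking estimate): {auc_masking:.2f}")
print(f"AUC (density measure):  {auc_density:.2f}")
```

A higher AUC for the masking estimate than for the density measure would correspond to the kind of result the abstract reports. A real comparison would also need a significance test on held-out screening data.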
