Detecting Shortcuts in Medical Images -- A Case Study in Chest X-rays
This work addresses data quality issues in medical imaging and is aimed at researchers and practitioners. Its contribution is incremental, since it builds on existing findings.
The study tackles shortcuts and artifacts in medical image data, particularly in chest X-rays. It validates previous findings, provides annotations for a subset of pneumothorax images with drains, and concludes with recommendations for medical image classification.
The availability of large public datasets and increased computing power have shifted the interest of the medical community toward high-performance algorithms. However, little attention is paid to the quality of the data and their annotations. High performance on benchmark datasets may be reported without considering possible shortcuts or artifacts in the data, and models are not tested on subpopulation groups. With this work, we aim to raise awareness of shortcut problems. We validate previous findings and present a case study on chest X-rays using two publicly available datasets. We share annotations for a subset of pneumothorax images with drains. We conclude with general recommendations for medical image classification.
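To make the idea of testing on subpopulation groups concrete, the following is a minimal sketch, not the authors' evaluation code. It assumes each image has a binary pneumothorax label, a model score, and a flag for whether a chest drain is visible. The function names, the `has_drain` field, and the toy scores are all hypothetical and serve only as illustration. The sketch compares the AUC over all images with the AUC when the positives are restricted to images with or without drains.

```python
from typing import Dict, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def subgroup_aucs(
    labels: Sequence[int],
    scores: Sequence[float],
    has_drain: Sequence[bool],
) -> Dict[str, float]:
    """Compare overall AUC with AUC on drain / no-drain positive subgroups.

    Assumption: every negative is kept in each subgroup, and only the
    positives are filtered by the (hypothetical) has_drain flag.
    """
    rows = list(zip(labels, scores, has_drain))

    def restricted(drain_flag: bool) -> float:
        kept = [(y, s) for y, s, d in rows if y == 0 or d == drain_flag]
        return auc([y for y, _ in kept], [s for _, s in kept])

    return {
        "all": auc(labels, scores),
        "positives_with_drain": restricted(True),
        "positives_without_drain": restricted(False),
    }


# Toy data, invented for illustration: the model scores drain images highly
# but is close to chance-level ordering on positives without a drain.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
has_drain = [True, True, False, False, False, False, False, False]
scores = [0.9, 0.8, 0.4, 0.3, 0.5, 0.35, 0.2, 0.1]

result = subgroup_aucs(labels, scores, has_drain)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

A large gap between the drain and no-drain subgroups in such a comparison would suggest that the model may rely on the drain as a shortcut rather than on signs of the pneumothorax itself.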