CV | Dec 5, 2024

Mask of truth: model sensitivity to unexpected regions of medical images

Théo Sourget, Michelle Hestbek-Møller, Amelia Jiménez-Sánchez, Jack Junchi Xu, Veronika Cheplygina

arXiv:2412.04030v37.65 citations | h-index: 27 | Has Code | Journal of Imaging Informatics in Medicine

Originality: Incremental advance

AI Analysis

This work addresses the problem of model reliability and spurious correlations in medical AI, which is critical for clinicians and patients, though it is incremental as it builds on existing concerns about model explainability.

The study investigated whether convolutional neural networks (CNNs) for medical image classification rely on clinically irrelevant parts of images, finding that models trained on chest X-rays and eye fundus images achieved above-random performance even when masking out regions of interest, with some models performing better on masked images than on those containing only the relevant regions.

The development of larger models for medical image analysis has led to increased performance. However, it has also affected our ability to explain and validate model decisions. Models can rely on clinically irrelevant parts of images, a behavior known as exploiting spurious correlations or shortcuts, to obtain high performance on benchmark datasets yet fail in real-world scenarios. In this work, we test whether convolutional neural networks (CNNs) can classify chest X-rays and eye fundus images when clinically relevant parts of the image are masked out. We show that all models trained on the PadChest dataset, irrespective of the masking strategy, obtain an Area Under the Curve (AUC) above random. Moreover, the models trained on full images perform well on images without the region of interest (ROI), even exceeding their performance on images containing only the ROI. We also reveal a possible spurious correlation in the Chaksu dataset, although performance there is more in line with what would be expected of an unbiased model. We go beyond the performance analysis by applying the explainability method SHAP and analyzing embeddings. To complement our findings with clinical knowledge, we asked a radiology resident to interpret chest X-rays under different masking strategies. Our code is available at https://github.com/TheoSourget/MMC_Masking and https://github.com/TheoSourget/MMC_Masking_EyeFundus
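The masking experiment described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `mask_out_roi`, `keep_only_roi`, and `binary_auc` are hypothetical, the ROI is assumed to be given as a binary array of the same shape as the image, and masked pixels are assumed to be filled with a constant value. The masking strategies actually used in the paper may differ.

```python
import numpy as np


def mask_out_roi(image, roi_mask, fill_value=0.0):
    """Return a copy of `image` with ROI pixels replaced by `fill_value`.

    Assumption: `roi_mask` is a binary array with the same shape as `image`.
    """
    image = np.asarray(image, dtype=float)
    roi_mask = np.asarray(roi_mask, dtype=bool)
    return np.where(roi_mask, fill_value, image)


def keep_only_roi(image, roi_mask, fill_value=0.0):
    """Return a copy of `image` where everything outside the ROI is replaced."""
    image = np.asarray(image, dtype=float)
    roi_mask = np.asarray(roi_mask, dtype=bool)
    return np.where(roi_mask, image, fill_value)


def binary_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("AUC needs at least one positive and one negative label")
    diff = pos[:, None] - neg[None, :]  # every positive/negative score pair
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())


# Toy 4x4 "image" with a 2x2 ROI in the centre (illustrative data only).
image = np.arange(16, dtype=float).reshape(4, 4)
roi = np.zeros((4, 4), dtype=bool)
roi[1:3, 1:3] = True

without_roi = mask_out_roi(image, roi)
only_roi = keep_only_roi(image, roi)

# Made-up classifier scores, used only to show how the AUC is computed.
auc = binary_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

In an experiment of this kind, a model's scores on `without_roi`-style inputs would be compared against its scores on `only_roi`-style inputs; an AUC well above 0.5 on images with the ROI removed would suggest the model is drawing on information outside the clinically relevant region.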
