Impact of lung segmentation on the diagnosis and explanation of COVID-19 in chest X-ray images
This work addresses the accuracy and explainability of COVID-19 diagnosis in medical imaging. Its contribution is incremental, since it applies existing segmentation and classification methods to this specific problem.
The study investigated how lung segmentation affects COVID-19 diagnosis from chest X-ray images. Classification on segmented images reached an F1-Score of 0.88 in the multi-class setup and 0.83 for COVID-19 identification. Cross-dataset generalization remained limited, with an F1-Score of 0.74, and the experiments point to a strong bias introduced by the different data sources even after segmentation.
COVID-19 frequently provokes pneumonia, which can be diagnosed using imaging exams. Chest X-ray (CXR) is often useful because it is cheap, fast, widespread, and uses less radiation. Here, we demonstrate the impact of lung segmentation on COVID-19 identification using CXR images and evaluate which contents of the image most influenced the classification. Semantic segmentation was performed using a U-Net CNN architecture, and classification using three CNN architectures (VGG, ResNet, and Inception). Explainable Artificial Intelligence techniques were employed to estimate the impact of segmentation. A three-class database was composed: lung opacity (pneumonia), COVID-19, and normal. We assessed the impact of creating a CXR image database from different sources, and the generalization of COVID-19 identification from one source to another. The segmentation achieved a Jaccard distance of 0.034 and a Dice coefficient of 0.982. The classification using segmented images achieved an F1-Score of 0.88 for the multi-class setup, and 0.83 for COVID-19 identification. In the cross-dataset scenario, we obtained an F1-Score of 0.74 and an area under the ROC curve of 0.9 for COVID-19 identification using segmented images. Experiments support the conclusion that even after segmentation, there is a strong bias introduced by underlying factors from different sources.
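The two segmentation metrics reported above can be illustrated with a minimal sketch, assuming binary lung masks stored as NumPy arrays. The function names (`dice_coefficient`, `jaccard_distance`, `apply_lung_mask`) and the toy masks are hypothetical and are not taken from the study's code; the masking step is only one plausible way to keep the lung region before classification.

```python
import numpy as np


def dice_coefficient(pred, target):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Both masks empty: treat as perfect agreement (a convention, assumed here).
        return 1.0
    return float(2.0 * intersection / total)


def jaccard_distance(pred, target):
    """Jaccard distance 1 - |A∩B| / |A∪B| for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Both masks empty: distance zero, matching the convention above.
        return 0.0
    intersection = np.logical_and(pred, target).sum()
    return float(1.0 - intersection / union)


def apply_lung_mask(image, mask):
    """Zero out every pixel outside the lung mask (illustrative assumption)."""
    return np.where(np.asarray(mask, dtype=bool), image, 0)


# Toy 2x3 masks, invented for illustration only.
predicted = np.array([[1, 1, 0], [0, 1, 0]])
reference = np.array([[1, 0, 0], [0, 1, 1]])

dice = dice_coefficient(predicted, reference)
jaccard = jaccard_distance(predicted, reference)
masked = apply_lung_mask(np.full((2, 3), 7), predicted)
```

For a single pair of masks the two metrics are linked by Dice = 2J / (1 + J), where J is the Jaccard index (one minus the Jaccard distance). A distance of 0.034 gives J = 0.966 and a Dice value of roughly 0.98, in line with the reported 0.982. Values averaged over many images need not satisfy this identity exactly.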