IV, CV, LG · Dec 22, 2020

Objective Evaluation of Deep Uncertainty Predictions for COVID-19 Detection

arXiv:2012.11840v1 · 79 citations
AI Analysis

This work addresses the problem of quantifying predictive confidence in deep neural networks for medical image analysis, which is crucial for trusted deployment in clinical settings.

The paper evaluates three uncertainty quantification techniques for COVID-19 detection using chest X-Ray images. It finds that networks pretrained on CXR images outperform those pretrained on natural image datasets, and that predictive uncertainty estimates are statistically higher for erroneous predictions, allowing for flagging risky predictions.

Deep neural networks (DNNs) have been widely applied for detecting COVID-19 in medical images. Existing studies mainly apply transfer learning and other data representation strategies to generate accurate point estimates. The generalization power of these networks is always questionable because they are developed using small datasets and do not report their predictive confidence. Quantifying uncertainties associated with DNN predictions is a prerequisite for their trusted deployment in medical settings. Here we apply and evaluate three uncertainty quantification techniques for COVID-19 detection using chest X-Ray (CXR) images. The novel concept of an uncertainty confusion matrix is proposed and new performance metrics for the objective evaluation of uncertainty estimates are introduced. Through comprehensive experiments, it is shown that networks pretrained on CXR images outperform networks pretrained on natural image datasets such as ImageNet. Qualitative and quantitative evaluations also reveal that the predictive uncertainty estimates are statistically higher for erroneous predictions than for correct predictions. Accordingly, uncertainty quantification methods are capable of flagging risky predictions with high uncertainty estimates. We also observe that ensemble methods more reliably capture uncertainties during inference.
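
To make the idea of flagging risky predictions concrete, here is a minimal Python sketch under stated assumptions. The abstract does not define the uncertainty confusion matrix or name the uncertainty measure, so the choices below are illustrative guesses rather than the paper's method: uncertainty is taken as the entropy of the ensemble-averaged class probabilities, and each prediction is cross-tabulated by correct/incorrect against certain/uncertain using a fixed threshold. The function names, the cell labels (TC, FU, FC, TU), and the threshold are all hypothetical.

```python
import math


def predictive_entropy(member_probs):
    """Entropy of the ensemble-averaged class distribution for one sample.

    member_probs: list of probability vectors, one per ensemble member.
    (Assumption: entropy of the mean prediction is the uncertainty score.)
    """
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    mean = [sum(p[c] for p in member_probs) / n_members for c in range(n_classes)]
    # Skip zero-probability classes, whose contribution to entropy is zero.
    return -sum(p * math.log(p) for p in mean if p > 0.0)


def uncertainty_confusion(correct, uncertainty, threshold):
    """Cross-tabulate correctness against thresholded uncertainty.

    Hypothetical cell names:
      TC = correct and certain,   FU = correct but uncertain,
      FC = incorrect but certain, TU = incorrect and uncertain.
    """
    counts = {"TC": 0, "FU": 0, "FC": 0, "TU": 0}
    for is_correct, u in zip(correct, uncertainty):
        is_uncertain = u > threshold
        if is_correct:
            counts["FU" if is_uncertain else "TC"] += 1
        else:
            counts["TU" if is_uncertain else "FC"] += 1
    return counts


# Two ensemble members that disagree give a high-entropy averaged prediction.
disagreeing = [[0.9, 0.1], [0.1, 0.9]]
agreeing = [[0.9, 0.1], [0.95, 0.05]]
scores = [predictive_entropy(agreeing), predictive_entropy(disagreeing)]
cells = uncertainty_confusion([True, False], scores, threshold=0.5)
```

In this sketch, a useful uncertainty estimate concentrates mass in the TC and TU cells: correct predictions are confident and wrong ones are flagged. A large FC count would mean the model is confidently wrong, which is the case the abstract argues must be detectable before clinical use.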

