Aug 17, 2021

Incorporating Uncertainty in Learning to Defer Algorithms for Safe Computer-Aided Diagnosis

arXiv:2108.07392v5 · 25 citations
AI Analysis

This work addresses safety in medical diagnosis by reducing erroneous AI predictions for patients, though it is incremental as it builds on existing learning to defer methods.

The paper tackles erroneous diagnoses in computer-aided systems by proposing a learning to defer with uncertainty (LDU) algorithm that defers uncertain cases to human experts. LDU matches the F1 score of learning to defer without uncertainty information (LD) while deferring fewer patients: for pleural effusion diagnosis at an F1 score of 0.96, the deferral rate falls from 69% to 36%. When many patients receive a wrong diagnosis with high confidence, as in the diagnosis of any comorbidities, LDU increases the F1 score by 17%.

Deep neural networks are increasingly being used for computer-aided diagnosis, but erroneous diagnoses can be extremely costly for patients. We propose a learning to defer with uncertainty (LDU) algorithm which identifies patients for whom diagnostic uncertainty is high and defers them for evaluation by human experts. LDU was evaluated on the diagnosis of myocardial infarction (using discharge summaries), the diagnosis of any comorbidities (using structured data), and the diagnosis of pleural effusion and pneumothorax (using chest x-rays), and compared with 'learning to defer without uncertainty information' (LD) and 'direct triage by uncertainty' (DT) methods. LDU achieved the same F1 score as LD but deferred considerably fewer patients (e.g. 36% vs. 69% deferral rate for diagnosing pleural effusion with an F1 score of 0.96). Furthermore, even when many patients were assigned the wrong diagnosis with high confidence (e.g. for the diagnosis of any comorbidities) LDU achieved a 17% increase in F1 score, whereas DT was not applicable. Importantly, the weight of the defer loss in LDU can be easily adjusted to obtain the desired trade-off between diagnostic accuracy and deferral rate. In conclusion, LDU can readily augment any existing diagnostic network to reduce the risk of erroneous diagnoses in clinical practice.
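To make the comparison in the abstract concrete, the following is a minimal sketch of the 'direct triage by uncertainty' (DT) baseline, not of the authors' LDU method. It assumes binary entropy as the uncertainty measure, a fixed threshold, and a human expert who is always correct on deferred cases; the function names and toy data are hypothetical. It also illustrates why DT struggles when the network is confidently wrong: such cases have low entropy and are never deferred.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) prediction; 0 when p is 0 or 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def triage_by_uncertainty(probs, threshold):
    """Flag a patient for deferral when predictive entropy exceeds threshold."""
    return [binary_entropy(p) > threshold for p in probs]

def f1_score(y_true, y_pred):
    """F1 for the positive class, computed from raw counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def evaluate(probs, y_true, threshold):
    """Return (deferral rate, F1 of the combined network + expert system)."""
    defer = triage_by_uncertainty(probs, threshold)
    # Assumption: the expert always gives the true label on deferred cases.
    final = [t if d else int(p >= 0.5) for p, t, d in zip(probs, y_true, defer)]
    return sum(defer) / len(defer), f1_score(y_true, final)

# Hypothetical toy data: patient 5 (p = 0.98, true label 0) is a
# high-confidence error that entropy-based triage cannot catch.
probs = [0.95, 0.55, 0.10, 0.45, 0.98, 0.02]
y_true = [1, 0, 0, 1, 0, 0]

no_deferral = evaluate(probs, y_true, threshold=1.0)   # entropy never exceeds 1 bit
with_triage = evaluate(probs, y_true, threshold=0.9)
print(no_deferral, with_triage)
```

Lowering the threshold defers more patients and raises F1 only for uncertain cases; the confidently wrong prediction stays in the network's hands, which is the situation where the abstract reports that DT was not applicable.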

Code Implementations: 1 repo