CL CV | Oct 1, 2019

Learning to estimate label uncertainty for automatic radiology report parsing

arXiv:1910.00673v10.31 citations h-index: 8

Originality: Incremental advance

AI Analysis

This work addresses the statistical inefficiency that hard labels introduce into medical image models, offering a scalable approach to radiology report parsing with uncertainty quantification.

The paper tackles the problem that binary labels from rule-based radiology report parsers force image models to express unwarranted certainty. It trains a Bidirectional LSTM to augment these labels, achieving performance comparable to or better than domain-specific NLP while also providing uncertainty estimates.

Bootstrapping labels from radiology reports has become the scalable alternative for providing inexpensive ground truth for medical imaging. Because of the domain-specific nature of these reports, state-of-the-art report labeling tools are predominantly rule-based. These tools, however, typically yield a binary 0 or 1 prediction that indicates the presence or absence of abnormalities. These hard targets are then used as ground truth to train downstream image models, forcing the models to express a high degree of certainty even on cases where specificity is low. This could negatively impact the statistical efficiency of image models. We address this issue by training a Bidirectional Long Short-Term Memory network to augment heuristic-based discrete labels of X-ray reports from all body regions. The network achieves performance comparable to or better than domain-specific NLP, with additional uncertainty estimates that enable finer downstream image model training.
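
To illustrate why hard targets can force unwarranted certainty, the following minimal sketch compares binary cross-entropy under a hard label and under a soft label. It is not the paper's implementation: the 0.7 soft label, the function name `binary_cross_entropy`, and the grid search over candidate predictions are all assumptions made for illustration.

```python
import math


def binary_cross_entropy(pred: float, target: float, eps: float = 1e-12) -> float:
    """Cross-entropy between a predicted probability and a target in [0, 1]."""
    pred = min(max(pred, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(pred) + (1.0 - target) * math.log(1.0 - pred))


# Hypothetical case: the report text only weakly supports a finding.
soft_label = 0.7  # assumed probabilistic label from a text model
hard_label = 1.0  # what a binary rule-based labeler would emit instead

# Search a grid of candidate image-model outputs for the loss minimizer.
candidates = [i / 100 for i in range(1, 100)]
best_for_soft = min(candidates, key=lambda p: binary_cross_entropy(p, soft_label))
best_for_hard = min(candidates, key=lambda p: binary_cross_entropy(p, hard_label))

print(f"loss-minimizing prediction with soft label: {best_for_soft:.2f}")
print(f"loss-minimizing prediction with hard label: {best_for_hard:.2f}")
```

With the soft label, the loss is minimized when the prediction matches the label's own level of confidence. With the hard label, the loss keeps decreasing as the prediction approaches 1, so the best value is always the largest candidate on the grid.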
