IV CV LG | Feb 23, 2021

VisualCheXbert: Addressing the Discrepancy Between Radiology Report Labels and Image Labels

Saahil Jain, Akshay Smit, Steven QH Truong, Chanh DT Nguyen, Minh-Thanh Huynh, Mudit Jain, Victoria A. Young, Andrew Y. Ng, Matthew P. Lungren, Pranav Rajpurkar

arXiv:2102.11467v216.440 citations | Has Code

Originality: Incremental advance

AI Analysis

This paper addresses the problem of unreliable report labels for training medical image analysis models. The advance is incremental because it builds on existing NLP and computer vision techniques.

The paper tackles the discrepancy between radiology report labels and image labels with VisualCheXbert, a biomedically-pretrained BERT model that maps a radiology report directly to image labels under supervision from a computer vision model. VisualCheXbert improves average F1 by 0.14 over an approach using an existing radiology report labeler. It also agrees with radiologists labeling chest X-ray images better than radiologists labeling the corresponding reports do, by an average F1 of between 0.12 and 0.21.

Automatic extraction of medical conditions from free-text radiology reports is critical for supervising computer vision models to interpret medical images. In this work, we show that radiologists labeling reports significantly disagree with radiologists labeling corresponding chest X-ray images, which reduces the quality of report labels as proxies for image labels. We develop and evaluate methods to produce labels from radiology reports that have better agreement with radiologists labeling images. Our best performing method, called VisualCheXbert, uses a biomedically-pretrained BERT model to directly map from a radiology report to the image labels, with a supervisory signal determined by a computer vision model trained to detect medical conditions from chest X-ray images. We find that VisualCheXbert outperforms an approach using an existing radiology report labeler by an average F1 score of 0.14 (95% CI 0.12, 0.17). We also find that VisualCheXbert better agrees with radiologists labeling chest X-ray images than do radiologists labeling the corresponding radiology reports by an average F1 score across several medical conditions of between 0.12 (95% CI 0.09, 0.15) and 0.21 (95% CI 0.18, 0.24).
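The supervision and evaluation described in the abstract can be illustrated with a minimal sketch. It assumes, hypothetically, that the vision model outputs one probability per condition and that these are binarized with per-condition thresholds to form training targets for the report model. The abstract does not specify this detail, and the condition names, thresholds, and function names below are illustrative only, not the authors' implementation.

```python
from typing import Dict, List


def pseudo_labels(probs: Dict[str, float],
                  thresholds: Dict[str, float]) -> Dict[str, int]:
    """Binarize vision-model probabilities into targets for the report model.

    Assumption: one threshold per condition; the paper may do this differently.
    """
    return {cond: int(p >= thresholds[cond]) for cond, p in probs.items()}


def f1_score(y_true: List[int], y_pred: List[int]) -> float:
    """Binary F1 computed as 2*TP / (2*TP + FP + FN); 0.0 if undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def average_f1(truth: Dict[str, List[int]],
               pred: Dict[str, List[int]]) -> float:
    """Unweighted mean of per-condition F1 scores."""
    scores = [f1_score(truth[cond], pred[cond]) for cond in truth]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical vision-model output for one chest X-ray.
    probs = {"Cardiomegaly": 0.8, "Edema": 0.2}
    thresholds = {"Cardiomegaly": 0.5, "Edema": 0.5}
    print(pseudo_labels(probs, thresholds))

    # Hypothetical radiologist image labels vs. labels predicted from reports.
    image_labels = {"Cardiomegaly": [1, 1, 0, 0], "Edema": [1, 0]}
    report_preds = {"Cardiomegaly": [1, 0, 1, 0], "Edema": [1, 0]}
    print(average_f1(image_labels, report_preds))
```

In this sketch, agreement with radiologists labeling images is the per-condition F1 against their image labels, averaged across conditions. The confidence intervals reported in the abstract would need an additional resampling step that is not shown here.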
