CheX-GPT: Harnessing Large Language Models for Enhanced Chest X-ray Report Labeling
This work addresses the problem of scalable, accurate labeling of radiology reports, though its contribution is incremental in that it builds on existing language models.
The study tackled the challenge of labeling free-text chest X-ray reports by developing CheX-GPT, a BERT-based model trained on GPT-labeled data, which achieved higher accuracy and efficiency than existing models, as benchmarked on the new MIMIC-500 dataset.
Free-text radiology reports are a rich data source for various medical tasks, but labeling these texts effectively remains challenging. Traditional rule-based labeling methods fall short of capturing the nuances of diverse free-text patterns. Moreover, models trained on expert-annotated data are limited by data scarcity and pre-defined classes, which impairs their performance, flexibility, and scalability. To address these issues, our study offers three main contributions: 1) We demonstrate the potential of GPT as an adept labeler using carefully designed prompts. 2) Using only the data labeled by GPT, we trained a BERT-based labeler, CheX-GPT, which operates faster and more efficiently than its GPT counterpart. 3) To benchmark labeler performance, we introduced a publicly available expert-annotated test set, MIMIC-500, comprising 500 cases from the MIMIC validation set. Our findings, benchmarked on MIMIC-500, demonstrate that CheX-GPT not only surpasses existing models in labeling accuracy but also offers superior efficiency, flexibility, and scalability. Code and models are available at https://github.com/Soombit-ai/CheXGPT.
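To make the benchmarking step concrete, the following is a minimal sketch of how a labeler's output might be scored against expert annotations such as those in MIMIC-500. It is an illustration only: the abstract does not say which metric is used, so per-finding F1 with a macro average is an assumption, and the finding names, the function names `per_finding_f1` and `macro_f1`, and the dictionary data layout are all hypothetical rather than taken from the CheX-GPT code.

```python
from typing import Dict, List

# Hypothetical finding names, chosen only for illustration;
# they are not the label set defined by the paper.
FINDINGS = ["atelectasis", "pneumothorax", "pleural_effusion"]


def per_finding_f1(
    expert: List[Dict[str, int]],
    predicted: List[Dict[str, int]],
) -> Dict[str, float]:
    """Return the F1 score of `predicted` against `expert` for each finding.

    Each list element describes one report and maps a finding name to
    1 (present) or 0 (absent). The two lists must be aligned by report.
    """
    if len(expert) != len(predicted):
        raise ValueError("expert and predicted must cover the same reports")

    scores: Dict[str, float] = {}
    for finding in FINDINGS:
        tp = fp = fn = 0
        for gold, pred in zip(expert, predicted):
            g, p = gold[finding], pred[finding]
            if g == 1 and p == 1:
                tp += 1
            elif g == 0 and p == 1:
                fp += 1
            elif g == 1 and p == 0:
                fn += 1
        denom = 2 * tp + fp + fn
        # Assumed convention: report 0.0 when F1 is undefined
        # (no positives in either the expert or the predicted labels).
        scores[finding] = (2 * tp / denom) if denom else 0.0
    return scores


def macro_f1(scores: Dict[str, float]) -> float:
    """Unweighted mean of the per-finding F1 scores."""
    return sum(scores.values()) / len(scores)
```

Under these assumptions, two labelers (for example a rule-based one and a BERT-based one) could be compared by calling `per_finding_f1` on each labeler's predictions for the same expert-annotated reports and then comparing the per-finding and macro-averaged scores.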