CL, AI | Jun 27, 2024

Two-Pronged Human Evaluation of ChatGPT Self-Correction in Radiology Report Simplification

arXiv:2406.18859v1 | 26 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the need to share radiology reports with patients by simplifying them, though its contribution is incremental: it applies existing prompting methods to a specific domain.

This study tackled the problem of automatically generating patient-friendly simplifications of technical radiology reports using large language models, finding that self-correction prompting was effective in producing high-quality simplifications as evaluated by radiologists and laypeople.

Radiology reports are highly technical documents aimed primarily at doctor-doctor communication. There has been an increasing interest in sharing those reports with patients, necessitating providing them patient-friendly simplifications of the original reports. This study explores the suitability of large language models in automatically generating those simplifications. We examine the usefulness of chain-of-thought and self-correction prompting mechanisms in this domain. We also propose a new evaluation protocol that employs radiologists and laypeople, where radiologists verify the factual correctness of simplifications, and laypeople assess simplicity and comprehension. Our experimental results demonstrate the effectiveness of self-correction prompting in producing high-quality simplifications. Our findings illuminate the preferences of radiologists and laypeople regarding text simplification, informing future research on this topic.
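As a rough illustration of the self-correction prompting idea mentioned in the abstract, the sketch below drafts a simplification, asks the model to critique its own draft against the original report, and revises until the critique comes back clean. This is a minimal sketch under stated assumptions, not the authors' implementation: the prompt wording, the `OK` stopping convention, the `max_rounds` limit, and the `llm` callable (any function mapping a prompt string to a response string) are all hypothetical and not taken from the paper.

```python
from typing import Callable

# Hypothetical prompt templates; the paper's actual prompts are not shown here.
DRAFT_PROMPT = "Rewrite this radiology report in plain language for a patient:\n\n{report}"
REVIEW_PROMPT = (
    "Review the simplification against the original report. "
    "List factual errors or unexplained jargon, or reply with exactly OK.\n\n"
    "Report:\n{report}\n\nSimplification:\n{draft}"
)
REVISE_PROMPT = (
    "Revise the simplification to address the feedback.\n\n"
    "Report:\n{report}\n\nSimplification:\n{draft}\n\nFeedback:\n{feedback}"
)


def simplify_with_self_correction(
    report: str,
    llm: Callable[[str], str],  # assumed interface: prompt in, text out
    max_rounds: int = 2,        # assumed cap on critique/revise cycles
) -> str:
    """Draft a simplification, then let the model critique and revise it."""
    draft = llm(DRAFT_PROMPT.format(report=report))
    for _ in range(max_rounds):
        feedback = llm(REVIEW_PROMPT.format(report=report, draft=draft))
        if feedback.strip().upper() == "OK":
            break  # the model found nothing left to fix
        draft = llm(
            REVISE_PROMPT.format(report=report, draft=draft, feedback=feedback)
        )
    return draft
```

Passing the model in as a plain callable keeps the loop independent of any particular API client, so the same sketch can be exercised with a stub function before being wired to a real model.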

Code Implementations: 1 repo
