CL | Dec 7, 2025

Large Language Model-Based Generation of Discharge Summaries

arXiv:2512.06812v1 | h-index: 1 | Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses the need to reduce effort and errors for healthcare professionals by automating discharge summary generation, though it is incremental as it compares existing models on a specific task.

The study tackled the problem of automating discharge summary generation by evaluating five large language models on MIMIC-III data, finding that proprietary models like Gemini with one-shot prompting outperformed others in similarity to gold-standard summaries, while open-source models lagged due to issues like hallucinations.

Discharge summaries are documents written by medical professionals that detail a patient's visit to a care facility. They contain a wealth of information crucial for patient care, and automating their generation could significantly reduce the effort required from healthcare professionals, minimize errors, and ensure that critical patient information is easily accessible and actionable. In this work, we explore the use of five large language models (LLMs) for this task, from open-source models (Mistral, Llama 2) to proprietary systems (GPT-3, GPT-4, Gemini 1.5 Pro), leveraging MIMIC-III summaries and notes. We evaluate them using exact-match, soft-overlap, and reference-free metrics. Our results show that proprietary models, particularly Gemini with one-shot prompting, outperformed the others, producing summaries with the highest similarity to the gold-standard ones. Open-source models, while promising (especially Mistral after fine-tuning), lagged in performance, often struggling with hallucinations and repeated information. Human evaluation by a clinical expert confirmed the practical utility of the summaries generated by proprietary models. Despite challenges such as hallucinations and missing information, the findings suggest that LLMs, especially proprietary models, are promising candidates for automatic discharge summary generation as long as data privacy is ensured.
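
The abstract mentions exact-match metrics without naming them. As a rough illustration only, the sketch below computes a unigram-overlap F1 score between a generated summary and a gold-standard one, in the spirit of ROUGE-1. The function name `unigram_overlap_f1`, the whitespace tokenization, and the lowercasing are assumptions made for this example and are not taken from the paper.

```python
from collections import Counter


def unigram_overlap_f1(candidate: str, reference: str) -> float:
    """Hypothetical helper: exact-match unigram F1 between two texts.

    Assumes simple lowercased whitespace tokenization; the paper's
    actual metrics and preprocessing may differ.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each token counts at most as often as it
    # appears in both texts.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    generated = "patient discharged home"
    gold = "patient discharged to home"
    print(round(unigram_overlap_f1(generated, gold), 3))
```

A score like this rewards only identical tokens, which is presumably why the authors pair exact-match metrics with soft-overlap and reference-free ones: a paraphrased but clinically correct summary would score poorly here.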
