Faithful Summarization of Consumer Health Queries: A Cross-Lingual Framework with LLMs
This work addresses the risk of unfaithful summaries in healthcare communication, a concern that bears directly on patient safety. The contribution is incremental, since it builds on existing methods such as TextRank and LLMs.
The authors tackle the problem of generating faithful summaries of consumer health queries so that medical details are not misrepresented. In human evaluation, over 80% of generated summaries preserve critical medical information, and quality and faithfulness metrics improve consistently on both the English and Bangla datasets.
Summarizing consumer health questions (CHQs) can ease communication in healthcare, but unfaithful summaries that misrepresent medical details pose serious risks. We propose a framework that combines TextRank-based sentence extraction and medical named entity recognition with large language models (LLMs) to enhance faithfulness in medical text summarization. In our experiments, we fine-tuned the LLaMA-2-7B model on the MeQSum (English) and BanglaCHQ-Summ (Bangla) datasets, achieving consistent improvements across quality (ROUGE, BERTScore, readability) and faithfulness (SummaC, AlignScore) metrics, and outperforming zero-shot baselines and prior systems. Human evaluation further shows that over 80% of generated summaries preserve critical medical information. These results highlight faithfulness as an essential dimension for reliable medical summarization and demonstrate the potential of our approach for safer deployment of LLMs in healthcare contexts.
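To make the extraction step concrete, the following is a minimal sketch of how TextRank-style sentence ranking and entity spotting might feed an LLM prompt. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the similarity measure, the NER model, or the prompt format, so the word-overlap similarity, the dictionary lookup standing in for medical NER, and the helper names (`extract_key_sentences`, `build_prompt`) are all hypothetical.

```python
import math
import re


def split_sentences(text):
    """Naive splitter: break after '.', '?' or '!' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.?!])\s+", text.strip()) if s]


def similarity(a, b):
    """Word-overlap similarity, normalized by log vocabulary sizes (assumed measure)."""
    wa = set(re.findall(r"\w+", a.lower()))
    wb = set(re.findall(r"\w+", b.lower()))
    if len(wa) < 2 or len(wb) < 2:
        return 0.0  # avoids log(1) = 0 in the denominator
    return len(wa & wb) / (math.log(len(wa)) + math.log(len(wb)))


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences by weighted PageRank over the similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    weights = [
        [similarity(sentences[i], sentences[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def extract_key_sentences(text, medical_terms, top_k=2):
    """Return the top_k ranked sentences (in original order) and matched terms.

    The dictionary lookup is a hypothetical stand-in for a medical NER model.
    """
    sentences = split_sentences(text)
    scores = textrank(sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    key_sentences = [sentences[i] for i in sorted(ranked[:top_k])]
    entities = [t for t in medical_terms if t.lower() in text.lower()]
    return key_sentences, entities


def build_prompt(text, key_sentences, entities):
    """Assemble an illustrative prompt; the real prompt format is not given."""
    return (
        "Summarize the consumer health question below in one question.\n"
        f"Key sentences: {' '.join(key_sentences)}\n"
        f"Medical entities to preserve: {', '.join(entities)}\n"
        f"Question: {text}\n"
    )


query = (
    "I have type 2 diabetes and take metformin daily. "
    "My doctor said metformin can upset the stomach. "
    "Is it safe to take ibuprofen with metformin? "
    "Thanks for your help."
)
key, ents = extract_key_sentences(query, ["metformin", "ibuprofen", "aspirin"])
print(build_prompt(query, key, ents))
```

The idea behind such a design is that the extracted sentences and entities act as explicit anchors, so the model is reminded which medical details the summary must keep. In a real system the dictionary lookup would be replaced by a trained medical NER model, and the prompt would be passed to the fine-tuned LLM.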