CL | Feb 21, 2024

The Lay Person's Guide to Biomedicine: Orchestrating Large Language Models

Zheheng Luo, Qianqian Xie, Sophia Ananiadou

arXiv:2402.13498v1 | 2 citations | h-index: 15

Originality: Incremental advance

AI Analysis

This work addresses the challenge of making complex biomedical information accessible to non-experts, representing an incremental advancement by leveraging LLMs for both generation and evaluation in lay summarization.

The paper tackles automated lay summarization (LS) of biomedical articles, where existing methods often struggle with simplification and lack automated metrics for 'layness'. It proposes an Explain-then-Summarise framework that uses large language models (LLMs) to generate background knowledge, together with two novel LLM-based evaluation metrics. The authors report improved supervised LS and a high degree of alignment between their zero-shot evaluation metric and human preferences.

Automated lay summarisation (LS) aims to simplify complex technical documents into a more accessible format to non-experts. Existing approaches using pre-trained language models, possibly augmented with external background knowledge, tend to struggle with effective simplification and explanation. Moreover, automated methods that can effectively assess the 'layness' of generated summaries are lacking. Recently, large language models (LLMs) have demonstrated a remarkable capacity for text simplification, background information generation, and text evaluation. This has motivated our systematic exploration into using LLMs to generate and evaluate lay summaries of biomedical articles. We propose a novel Explain-then-Summarise LS framework, which leverages LLMs to generate high-quality background knowledge to improve supervised LS. We also evaluate the performance of LLMs for zero-shot LS and propose two novel LLM-based LS evaluation metrics, which assess layness from multiple perspectives. Finally, we conduct a human assessment of generated lay summaries. Our experiments reveal that LLM-generated background information can support improved supervised LS. Furthermore, our novel zero-shot LS evaluation metric demonstrates a high degree of alignment with human preferences. We conclude that LLMs have an important part to play in improving both the performance and evaluation of LS methods.
