CL Jul 23, 2024

Retrieve, Generate, Evaluate: A Case Study for Medical Paraphrases Generation with Small Language Models

Ioana Buhnila, Aman Sinha, Mathieu Constant

arXiv:2407.16565v114.127 citations h-index: 3 Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for resource-efficient and accurate medical text generation, particularly for French, but is incremental as it adapts existing methods to a specific domain.

The authors tackle medical paraphrase generation by introducing pRAGe, a retrieval-augmented generation pipeline built on small language models, and report that it is effective for French medical text, with improved factual grounding and reduced computational demands.

The recent surge in the accessibility of large language models (LLMs) to the general population can lead to untrackable use of such models for medical-related recommendations. Language generation via LLMs has two key problems: first, LLMs are prone to hallucination, so any medical use requires scientific and factual grounding; second, LLMs pose a tremendous challenge to computational resources due to their gigantic model size. In this work, we introduce pRAGe, a pipeline for Retrieval Augmented Generation and evaluation of medical paraphrase generation using Small Language Models (SLMs). We study the effectiveness of SLMs and the impact of an external knowledge base on medical paraphrase generation in French.
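The retrieve, generate, evaluate flow named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy knowledge base, the bag-of-words retriever, the prompt wording, the token-overlap grounding score, and every function name below are assumptions made for illustration, and the generation step is left as a caller-supplied function because the abstract does not name a specific model.

```python
import math
import re
from collections import Counter

# Hypothetical toy knowledge base of French definitions (illustrative only).
KNOWLEDGE_BASE = [
    "L'hypertension est une pression artérielle trop élevée.",
    "Une céphalée est un mal de tête.",
    "La tachycardie est un rythme cardiaque trop rapide.",
]

def tokenize(text):
    """Lowercase the text and keep runs of letters, including accented ones."""
    return re.findall(r"[^\W\d_]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, knowledge_base, k=1):
    """Return the k passages most similar to the query (simple lexical retriever)."""
    q = Counter(tokenize(query))
    ranked = sorted(
        knowledge_base,
        key=lambda passage: cosine(q, Counter(tokenize(passage))),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(term, passages):
    """Assemble a prompt that grounds the paraphrase request in retrieved text."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        f"Context:\n{context}\n\n"
        f"Paraphrase the medical term '{term}' in plain French, "
        "using only the context above."
    )

def grounding_score(paraphrase, passages):
    """Fraction of paraphrase tokens that also appear in the retrieved passages."""
    tokens = tokenize(paraphrase)
    if not tokens:
        return 0.0
    support = set(tokenize(" ".join(passages)))
    return sum(token in support for token in tokens) / len(tokens)

def paraphrase_term(term, knowledge_base, generate, k=1):
    """Retrieve context, call the caller-supplied generator, score the output."""
    passages = retrieve(term, knowledge_base, k=k)
    output = generate(build_prompt(term, passages))
    return output, grounding_score(output, passages)
```

Here `generate` is any function that maps a prompt string to generated text, for example a thin wrapper around a locally hosted small language model. The overlap score is only a crude stand-in for the evaluation stage, which in practice would rely on proper paraphrase-quality metrics or human judgment.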
