CL LG | May 2, 2024

Prompt engineering paradigms for medical applications: scoping review and recommendations for better practices

Jamil Zaghir, Marco Naguib, Mina Bjelogrlic, Aurélie Névéol, Xavier Tannier, Christian Lovis

arXiv:2405.01249v1 | 9.699 citations | h-index: 25 | J Med Internet Res

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for standardized practices in prompt engineering for medical applications, but it is incremental as it synthesizes existing research without introducing new methods.

The paper reviewed 114 recent studies on prompt engineering in medicine, finding that prompt design is the most common approach and that many studies lack non-prompt baselines, leading to recommendations for better practices.

Prompt engineering is crucial for harnessing the potential of large language models (LLMs), especially in the medical domain, where specialized terminology and phrasing are used. However, the efficacy of prompt engineering in the medical domain remains to be explored. In this work, 114 recent studies (2022-2024) applying prompt engineering in medicine, covering prompt learning (PL), prompt tuning (PT), and prompt design (PD), are reviewed. PD is the most prevalent (78 articles). In 12 papers, the terms PD, PL, and PT were used interchangeably. ChatGPT is the most commonly used LLM, with seven papers using it to process sensitive clinical data. Chain-of-Thought emerges as the most common prompt engineering technique. While PL and PT articles typically provide a baseline for evaluating prompt-based approaches, 64% of PD studies lack non-prompt-related baselines. We provide tables and figures summarizing existing work, along with reporting recommendations to guide future research contributions.
