Expert-Guided Prompting and Retrieval-Augmented Generation for Emergency Medical Service Question Answering
This work addresses the need for more accurate and reliable medical question answering in high-stakes emergency settings. Its contribution is incremental, since it builds on existing prompting and retrieval techniques.
The paper addresses the tendency of large language models to overlook domain-specific expertise in emergency medical service (EMS) question answering. It introduces EMSQA, a dataset of 24.3K questions, along with two methods, Expert-CoT and ExpertRAG. Expert-CoT improves accuracy by up to 2.05% over vanilla CoT prompting, and combining Expert-CoT with ExpertRAG yields up to a 4.59% accuracy gain over standard RAG baselines. The 32B expertise-augmented LLMs pass all EMS certification simulation exams.
Large language models (LLMs) have shown promise in medical question answering, yet they often overlook the domain-specific expertise that professionals depend on, such as the clinical subject area (e.g., trauma, airway) and the certification level (e.g., EMT, Paramedic). Existing approaches typically apply general-purpose prompting or retrieval strategies without leveraging this structured context, which limits performance in high-stakes settings. We address this gap with EMSQA, a 24.3K-question multiple-choice dataset spanning 10 clinical subject areas and 4 certification levels, accompanied by curated, subject area-aligned knowledge bases (40K documents and 2M tokens). Building on EMSQA, we introduce (i) Expert-CoT, a prompting strategy that conditions chain-of-thought (CoT) reasoning on a specific clinical subject area and certification level, and (ii) ExpertRAG, a retrieval-augmented generation pipeline that grounds responses in subject area-aligned documents and real-world patient data. Experiments on 4 LLMs show that Expert-CoT improves accuracy by up to 2.05% over vanilla CoT prompting. Additionally, combining Expert-CoT with ExpertRAG yields up to a 4.59% accuracy gain over standard RAG baselines. Notably, the 32B expertise-augmented LLMs pass all the computer-adaptive EMS certification simulation exams.
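To make the two ideas concrete, the following is a minimal sketch, not the paper's implementation. It assumes that Expert-CoT can be approximated by stating the subject area and certification level in the prompt before asking for step-by-step reasoning, and that the subject area-aligned part of ExpertRAG can be approximated by restricting retrieval to documents tagged with the question's subject area. The names `Doc`, `retrieve`, and `build_expert_cot_prompt`, the prompt wording, and the word-overlap ranking are all hypothetical; the patient-data component of ExpertRAG is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical knowledge-base entry tagged with a clinical subject area."""
    subject_area: str
    text: str


def retrieve(question, docs, subject_area, k=2):
    """Toy retrieval: keep only docs from the given subject area,
    then rank them by word overlap with the question (an assumption,
    standing in for whatever retriever the paper actually uses)."""
    q_words = set(question.lower().split())
    candidates = [d for d in docs if d.subject_area == subject_area]
    ranked = sorted(
        candidates,
        key=lambda d: len(q_words & set(d.text.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_expert_cot_prompt(question, options, subject_area, level, context=()):
    """Assemble a prompt that conditions chain-of-thought reasoning on the
    subject area and certification level (wording is illustrative only)."""
    lines = [
        f"You are an EMS expert at the {level} certification level, "
        f"answering a question in the {subject_area} subject area."
    ]
    if context:
        lines.append("Reference material:")
        lines.extend(f"- {d.text}" for d in context)
    lines.append(f"Question: {question}")
    lines.extend(f"{label}. {text}" for label, text in options.items())
    lines.append("Think step by step, then give the final answer as a single option letter.")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    docs = [
        Doc("airway", "Open the airway with a head tilt chin lift"),
        Doc("trauma", "Control bleeding with direct pressure"),
        Doc("airway", "Suction the airway if secretions are present"),
    ]
    question = "How do you open the airway of an unresponsive patient"
    options = {"A": "Head tilt chin lift", "B": "Apply a tourniquet"}
    context = retrieve(question, docs, "airway", k=1)
    print(build_expert_cot_prompt(question, options, "airway", "Paramedic", context))
```

The resulting string would be sent to an LLM of choice. Filtering by subject area before ranking keeps off-topic documents (here, the trauma entry) out of the context regardless of their lexical overlap with the question.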