MKRAG: Medical Knowledge Retrieval Augmented Generation for Medical Question Answering
This work addresses the challenge of adapting black-box LLMs for domain-specific medical QA, offering a practical, incremental improvement without fine-tuning.
The paper tackles the poor performance of large language models (LLMs) on medical question answering by using retrieval augmented generation (RAG) to extract medical facts and inject them into prompts. On the MedQA-SMILE dataset, this raises the accuracy of Vicuna-7B from 44.46% to 48.54%, a gain of 4.08 percentage points.
Large Language Models (LLMs), although powerful in general domains, often perform poorly on domain-specific tasks such as medical question answering (QA). In addition, LLMs tend to function as "black boxes", making it challenging to modify their behavior. To address these problems, our work employs a transparent process of retrieval augmented generation (RAG), aiming to improve LLM responses without the need for fine-tuning or retraining. Specifically, we propose a comprehensive retrieval strategy to extract medical facts from an external knowledge base and then inject them into the LLM's query prompt. Focusing on medical QA, we evaluate the impact of different retrieval models and the number of facts on LLM performance using the MedQA-SMILE dataset. Notably, our retrieval-augmented Vicuna-7B model exhibited an accuracy improvement from 44.46% to 48.54%. This work underscores the potential of RAG to enhance LLM performance, offering a practical approach to mitigate the challenges posed by black-box LLMs.
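The retrieve-then-inject idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-words cosine scorer is a simple stand-in for the retrieval models the paper compares, and the names `FACTS`, `retrieve_facts`, and `build_prompt`, the prompt wording, and the three example facts are all hypothetical. The parameter `k` plays the role of the number of injected facts that the paper varies.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for an external medical knowledge base.
FACTS = [
    "Warfarin inhibits vitamin K epoxide reductase.",
    "Aspirin irreversibly inhibits cyclooxygenase.",
    "Metformin is a first-line medication for type 2 diabetes.",
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve_facts(question, facts, k=2):
    """Return up to k facts with nonzero similarity to the question, best first."""
    query = Counter(tokenize(question))
    scored = [(cosine(query, Counter(tokenize(fact))), fact) for fact in facts]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [fact for score, fact in scored[:k] if score > 0]

def build_prompt(question, options, facts):
    """Inject retrieved facts ahead of a multiple-choice question."""
    lines = ["Use the following medical facts to answer the question."]
    lines += [f"Fact: {fact}" for fact in facts]
    lines.append(f"Question: {question}")
    lines += [f"{label}. {text}" for label, text in options.items()]
    lines.append("Answer:")
    return "\n".join(lines)

question = "Which enzyme does warfarin inhibit?"
options = {"A": "Cyclooxygenase", "B": "Vitamin K epoxide reductase"}
prompt = build_prompt(question, options, retrieve_facts(question, FACTS, k=2))
print(prompt)
```

The resulting string would be passed to the LLM unchanged, which is what keeps the approach free of fine-tuning: only the prompt differs between the baseline and the retrieval-augmented run.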