Using Pretrained Large Language Model with Prompt Engineering to Answer Biomedical Questions
This work addresses the problem of accurate biomedical information retrieval and question answering for researchers and practitioners, but it is incremental as it applies existing LLM methods to a specific domain.
The paper tackles biomedical question answering with a system built on pre-trained large language models, prompt engineering and response post-processing, achieving a 0.96 F1 score on yes/no questions and a 0.38 MRR score on factoid questions in the BioASQ 2024 challenge.
Our team participated in the BioASQ 2024 Task 12b and Synergy tasks to build a system that can answer biomedical questions by retrieving relevant articles and snippets from the PubMed database and generating exact and ideal answers. We propose a two-level information retrieval and question-answering system based on pre-trained large language models (LLMs), focused on LLM prompt engineering and response post-processing. We construct prompts with in-context few-shot examples and apply post-processing techniques such as resampling and malformed response detection. We compare the performance of several pre-trained LLMs on this challenge, including Mixtral, OpenAI GPT and Llama2. In Task 12b, our best-performing system achieved a MAP score of 0.14 on document retrieval, a MAP score of 0.05 on snippet retrieval, an F1 score of 0.96 on yes/no questions, an MRR score of 0.38 on factoid questions and an F1 score of 0.50 on list questions.
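To make the idea of few-shot prompting combined with malformed response detection and resampling more concrete, the following is a minimal sketch for yes/no questions. It is not the authors' implementation. It assumes a JSON answer format, and the few-shot examples, the function names (`build_prompt`, `parse_yes_no`, `answer_with_resampling`) and the `llm` callable are all hypothetical stand-ins for whatever model client and prompt template a real system would use.

```python
import json
from typing import Callable, Optional

# Hypothetical few-shot examples for illustration; not taken from the paper.
FEW_SHOT = [
    ("Is aspirin a nonsteroidal anti-inflammatory drug?", "yes"),
    ("Is insulin produced by the liver?", "no"),
]


def build_prompt(question: str, snippets: list[str]) -> str:
    """Assemble a few-shot prompt that asks for a JSON yes/no answer."""
    lines = [
        "Answer the biomedical question with JSON of the form "
        '{"answer": "yes"} or {"answer": "no"}.',
        "",
    ]
    for example_question, example_answer in FEW_SHOT:
        lines.append(f"Question: {example_question}")
        lines.append("Answer: " + json.dumps({"answer": example_answer}))
        lines.append("")
    if snippets:
        lines.append("Context:")
        lines.extend(f"- {snippet}" for snippet in snippets)
    lines.append(f"Question: {question}")
    lines.append("Answer:")
    return "\n".join(lines)


def parse_yes_no(response: str) -> Optional[str]:
    """Return 'yes' or 'no', or None when the response is malformed."""
    try:
        data = json.loads(response.strip())
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    answer = data.get("answer")
    if isinstance(answer, str) and answer.lower() in ("yes", "no"):
        return answer.lower()
    return None


def answer_with_resampling(
    question: str,
    snippets: list[str],
    llm: Callable[[str], str],  # assumed interface: prompt in, raw text out
    max_attempts: int = 3,
) -> Optional[str]:
    """Query the model again until a well-formed answer is parsed."""
    prompt = build_prompt(question, snippets)
    for _ in range(max_attempts):
        parsed = parse_yes_no(llm(prompt))
        if parsed is not None:
            return parsed
    return None  # every attempt was malformed
```

In this sketch a response counts as malformed when it is not valid JSON or does not carry a yes/no value under the `answer` key. Resampling only helps if the `llm` callable is non-deterministic (for example, sampling with a non-zero temperature), which is an assumption about the model client rather than something the sketch enforces.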