Clinfo.ai: An Open-Source Retrieval-Augmented Large Language Model System for Answering Medical Questions using Scientific Literature
The work addresses the problem of timely access to medical literature for clinicians and researchers. Its contribution is incremental: it builds on existing retrieval-augmented LLM methods and adds new tools and benchmarks.
To help clinicians and researchers keep up with the medical literature, the authors developed Clinfo.ai, an open-source retrieval-augmented LLM system for answering medical questions. They report benchmark results on a new dataset, PubMedRS-200, showing performance competitive with other OpenQA systems.
The rapidly expanding body of published medical literature makes it challenging for clinicians and researchers to keep up with and summarize recent, relevant findings in a timely manner. While several closed-source summarization tools based on large language models (LLMs) now exist, rigorous and systematic evaluations of their outputs are lacking. Furthermore, there is a paucity of high-quality datasets and appropriate benchmark tasks with which to evaluate these tools. We address these issues with four contributions: we release Clinfo.ai, an open-source WebApp that answers clinical questions based on dynamically retrieved scientific literature; we specify an information retrieval and abstractive summarization task to evaluate the performance of such retrieval-augmented LLM systems; we release a dataset of 200 questions and corresponding answers derived from published systematic reviews, which we name PubMed Retrieval and Synthesis (PubMedRS-200); and we report benchmark results for Clinfo.ai and other publicly available OpenQA systems on PubMedRS-200.
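The retrieve-then-summarize pattern described above can be illustrated with a minimal sketch. This is not Clinfo.ai's implementation: the corpus is a set of placeholder strings, the retrieval step is a simple word-overlap ranking standing in for a real literature search, and the names `Abstract`, `retrieve`, and `build_prompt` are hypothetical. The LLM call itself is omitted; the sketch stops at the prompt that would be passed to a summarization model.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Abstract:
    pmid: str  # placeholder identifier, not a real PubMed ID
    text: str


# Toy corpus of placeholder abstracts (assumption: stands in for a live literature search).
CORPUS = [
    Abstract("toy-1", "Placeholder abstract about statins and cholesterol levels."),
    Abstract("toy-2", "Placeholder abstract about sleep and memory."),
    Abstract("toy-3", "Placeholder abstract about statins and exercise."),
]


def tokenize(text: str) -> set[str]:
    """Lowercase, split on whitespace, and strip basic punctuation."""
    return {w.strip(".,;:?!()").lower() for w in text.split()} - {""}


def retrieve(question: str, corpus: list[Abstract], k: int = 2) -> list[Abstract]:
    """Rank abstracts by word overlap with the question and keep the top k."""
    q = tokenize(question)
    scored = [(len(q & tokenize(a.text)), a) for a in corpus]
    scored = [(s, a) for s, a in scored if s > 0]  # drop abstracts with no overlap
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable sort keeps corpus order on ties
    return [a for _, a in scored[:k]]


def build_prompt(question: str, evidence: list[Abstract]) -> str:
    """Assemble a summarization prompt that cites each retrieved abstract by ID."""
    cited = "\n".join(f"[{a.pmid}] {a.text}" for a in evidence)
    return (
        "Answer the question using only the evidence below, citing IDs.\n"
        f"Question: {question}\n"
        f"Evidence:\n{cited}"
    )


if __name__ == "__main__":
    question = "Do statins lower cholesterol?"
    print(build_prompt(question, retrieve(question, CORPUS)))
```

Separating retrieval from prompt construction mirrors the two-part task the abstract specifies (information retrieval followed by abstractive summarization), so each stage could be evaluated on its own.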