Pareto-Optimized Open-Source LLMs for Healthcare via Context Retrieval
This work enables more affordable and reliable LLM solutions for healthcare, though the contribution is incremental because it builds on existing retrieval techniques.
This study addresses the high cost of healthcare AI by enhancing open-source LLMs with optimized context retrieval, achieving state-of-the-art accuracy on medical question answering on the MedQA benchmark at a fraction of the cost of proprietary models.
This study leverages optimized context retrieval to enhance open-source Large Language Models (LLMs) for cost-effective, high-performance healthcare AI. We demonstrate that this approach achieves state-of-the-art accuracy on medical question answering at a fraction of the cost of proprietary models, significantly improving the cost-accuracy Pareto frontier on the MedQA benchmark. Key contributions include: (1) OpenMedQA, a novel benchmark revealing a performance gap in open-ended medical QA compared to multiple-choice formats; (2) a practical, reproducible pipeline for context retrieval optimization; and (3) open-source resources (Prompt Engine, CoT/ToT/Thinking databases) to empower healthcare AI development. By advancing retrieval techniques and QA evaluation, we enable more affordable and reliable LLM solutions for healthcare.
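To make the notion of a cost-accuracy Pareto frontier concrete, the following is a minimal sketch of how such a frontier could be computed from a set of (cost, accuracy) measurements. It is not the authors' pipeline: the `ModelPoint` type, the `pareto_frontier` function, the model names, and all numeric values are hypothetical placeholders chosen for illustration, not results from the study. A configuration is on the frontier if no other configuration is at least as cheap and at least as accurate while being strictly better on one of the two.

```python
from typing import List, NamedTuple


class ModelPoint(NamedTuple):
    """One evaluated configuration (all fields are illustrative assumptions)."""
    name: str
    cost: float      # hypothetical cost per query, arbitrary units
    accuracy: float  # hypothetical accuracy in [0, 1]


def pareto_frontier(points: List[ModelPoint]) -> List[ModelPoint]:
    """Return the non-dominated points, ordered by increasing cost.

    A point is kept only if its accuracy is strictly higher than that of
    every point that costs the same or less.
    """
    # Cheapest first; among equal costs, the most accurate first.
    ordered = sorted(points, key=lambda p: (p.cost, -p.accuracy))
    frontier: List[ModelPoint] = []
    best_accuracy = float("-inf")
    for point in ordered:
        if point.accuracy > best_accuracy:
            frontier.append(point)
            best_accuracy = point.accuracy
    return frontier


# Placeholder values only; these are NOT measurements from the paper.
POINTS = [
    ModelPoint("model_a", cost=1.0, accuracy=0.60),
    ModelPoint("model_b", cost=2.0, accuracy=0.55),   # dominated by model_a
    ModelPoint("model_c", cost=3.0, accuracy=0.80),
    ModelPoint("model_d", cost=10.0, accuracy=0.85),
    ModelPoint("model_e", cost=12.0, accuracy=0.82),  # dominated by model_d
]

if __name__ == "__main__":
    for point in pareto_frontier(POINTS):
        print(f"{point.name}: cost={point.cost}, accuracy={point.accuracy}")
```

In this framing, "improving the frontier" means adding a configuration that is not dominated by any existing one, for example an open-source model with retrieval that matches a costlier model's accuracy at lower cost.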