CL Apr 27, 2023

PMC-LLaMA: Towards Building Open-source Language Models for Medicine

Chaoyi Wu, Weixiong Lin, Xiaoman Zhang, Ya Zhang, Yanfeng Wang, Weidi Xie

Harvard

arXiv:2304.14454v3 16.6114 citations h-index: 50 Has Code

Originality Synthesis-oriented

AI Analysis

This work addresses the need for accurate, domain-specific AI in medicine and offers an open-source alternative to proprietary models, though its contribution is incremental: it adapts existing methods to a new domain.

The paper tackles the lack of precision that large language models show in medical applications by building PMC-LLaMA, an open-source model with 13 billion parameters that incorporates 4.8M biomedical papers and 30K medical textbooks. The model achieves superior performance on public medical QA benchmarks, surpassing ChatGPT.

Recently, Large Language Models (LLMs) have showcased remarkable capabilities in natural language understanding. While demonstrating proficiency in everyday conversations and question-answering situations, these models frequently struggle in domains that require precision, such as medical applications, due to their lack of domain-specific knowledge. In this paper, we describe the procedure for building a powerful, open-source language model specifically designed for medical applications, termed PMC-LLaMA. Our contributions are threefold: (i) we systematically investigate the process of adapting a general-purpose foundation language model to the medical domain; this involves data-centric knowledge injection through the integration of 4.8M biomedical academic papers and 30K medical textbooks, as well as comprehensive fine-tuning for alignment with domain-specific instructions; (ii) we contribute a large-scale, comprehensive dataset for instruction tuning, which encompasses medical question-answering (QA), rationale for reasoning, and conversational dialogues, comprising a total of 202M tokens; (iii) we conduct thorough ablation studies to demonstrate the effectiveness of each proposed component. When evaluated on various public medical question-answering benchmarks, our lightweight PMC-LLaMA, which consists of only 13 billion parameters, exhibits superior performance, even surpassing ChatGPT. All models, code, and datasets can be found at https://github.com/chaoyi-wu/PMC-LLaMA.

