CL May 22, 2023

A Study of Generative Large Language Model for Medical Research and Healthcare

Cheng Peng, Xi Yang, Aokun Chen, Kaleb E Smith, Nima PourNejatian, Anthony B Costa, Cheryl Martin, Mona G Flores, Ying Zhang, Tanja Magoc, Gloria Lipori, Duane A Mitchell

arXiv:2305.13523v122.7451 citations Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for specialized LLMs in healthcare by offering a domain-specific model that could enhance medical research and applications. It is incremental in that it adapts an existing architecture to clinical data.

The study developed GatorTronGPT, a clinical generative LLM, to improve biomedical NLP for medical research. It reports that NLP models trained on GatorTronGPT-generated synthetic text outperform those trained on real clinical text, that physicians rated its output no differently from human-written text in readability and clinical relevance, and that they could not tell the two apart.

There is enormous enthusiasm for, and concern about, using large language models (LLMs) in healthcare, yet current assumptions are all based on general-purpose LLMs such as ChatGPT. This study develops a clinical generative LLM, GatorTronGPT, using 277 billion words of mixed clinical and English text with a GPT-3 architecture of 20 billion parameters. GatorTronGPT improves biomedical natural language processing for medical research. Synthetic NLP models trained using GatorTronGPT-generated text outperform NLP models trained using real-world clinical text. A physicians' Turing test using a scale of 1 (worst) to 9 (best) shows that there is no significant difference in linguistic readability (p = 0.22; 6.57 for GatorTronGPT compared with 6.93 for human) or clinical relevance (p = 0.91; 7.0 for GatorTronGPT compared with 6.97 for human), and that physicians cannot differentiate them (p < 0.001). This study provides insights into the opportunities and challenges of LLMs for medical research and healthcare.
