CL · Apr 3, 2023

DoctorGLM: Fine-tuning your Chinese Doctor is not a Herculean Task

arXiv:2304.01097v2 · 230 citations · h-index: 43 · Has Code
Originality: Synthesis-oriented
AI Analysis

This paper addresses the challenge of deploying medical LLMs in Chinese hospitals, though it is an incremental engineering attempt with acknowledged limitations.

The authors address the poor performance of large language models (LLMs) in Chinese medical domains by fine-tuning ChatGLM-6B on Chinese medical dialogues. The fine-tuning ran on a single A100 80G GPU in 13 hours, which the authors present as evidence that healthcare-purpose LLMs can be affordable.

The recent progress of large language models (LLMs), including ChatGPT and GPT-4, in comprehending and responding to human instructions has been remarkable. Nevertheless, these models typically perform better in English and have not been explicitly trained for the medical domain, resulting in suboptimal precision in diagnoses, drug recommendations, and other medical advice. Additionally, training and deploying a dialogue model is still believed to be impossible for hospitals, hindering the promotion of LLMs. To tackle these challenges, we have collected databases of medical dialogues in Chinese with ChatGPT's help and adopted several techniques to train an easy-to-deploy LLM. Remarkably, we were able to fine-tune ChatGLM-6B on a single A100 80G in 13 hours, which means having a healthcare-purpose LLM can be very affordable. DoctorGLM is currently an early-stage engineering attempt and contains various mistakes. We are sharing it with the broader community to invite feedback and suggestions to improve its healthcare-focused capabilities: https://github.com/xionghonglin/DoctorGLM.
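The affordability claim in the abstract rests on simple arithmetic: one GPU for 13 hours. The following is a minimal sketch of that estimate, not something taken from the paper. The function name `finetune_cost` and the hourly rental rates are hypothetical placeholders chosen for illustration; only the GPU count (1) and the duration (13 hours) come from the abstract.

```python
def finetune_cost(gpu_hours: float, num_gpus: int, hourly_rate_usd: float) -> float:
    """Estimate the rental cost in USD of a fine-tuning run.

    Cost = wall-clock hours * number of GPUs * price per GPU-hour.
    """
    if gpu_hours < 0 or num_gpus < 1 or hourly_rate_usd < 0:
        raise ValueError("hours and rate must be non-negative, num_gpus >= 1")
    return gpu_hours * num_gpus * hourly_rate_usd


# Figures reported in the abstract: a single A100 80G for 13 hours.
REPORTED_HOURS = 13
REPORTED_GPUS = 1

# ASSUMPTION: illustrative rental prices in USD per GPU-hour.
# These are not from the paper and vary by provider and date.
ASSUMED_RATES_USD = (1.0, 2.0, 4.0)

if __name__ == "__main__":
    for rate in ASSUMED_RATES_USD:
        cost = finetune_cost(REPORTED_HOURS, REPORTED_GPUS, rate)
        print(f"${rate:.2f} per GPU-hour -> ${cost:.2f} for the run")
```

Under any of these assumed rates the run costs tens of dollars in compute, which is the order of magnitude behind the "very affordable" framing. The estimate covers GPU rental only and ignores data collection, storage, and deployment costs.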

Code Implementations: 1 repo
