CVJun 21, 2023

OphGLM: Training an Ophthalmology Large Language-and-Vision Assistant based on Instructions and Dialogue

Weihao Gao, Zhuo Deng, Zhiyuan Niu, Fuju Rong, Chucheng Chen, Zheng Gong, Wenze Zhang, Daimin Xiao, Fang Li, Zhenjie Cao, Zhaoyi Ma, Wenbin Wei

arXiv:2306.12174v219.552 citationsh-index: 25Has Code

Originality Incremental advance

AI Analysis

This work addresses a domain-specific problem for ophthalmology by creating a novel multimodal model, though it is incremental as it adapts existing LMM techniques to a medical niche.

The authors tackled the lack of multimodal large language models in ophthalmology by developing OphGLM, an assistant that integrates vision and language for disease diagnosis and lesion segmentation from fundus images, achieving exceptional performance with potential to revolutionize clinical applications.

Large multimodal language models (LMMs) have achieved significant success in general domains. However, due to the significant differences between medical images and text and general web content, the performance of LMMs in medical scenarios is limited. In ophthalmology, clinical diagnosis relies on multiple modalities of medical images, but unfortunately, multimodal ophthalmic large language models have not been explored to date. In this paper, we study and construct an ophthalmic large multimodal model. Firstly, we use fundus images as an entry point to build a disease assessment and diagnosis pipeline to achieve common ophthalmic disease diagnosis and lesion segmentation. Then, we establish a new ophthalmic multimodal instruction-following and dialogue fine-tuning dataset based on disease-related knowledge data and publicly available real-world medical dialogue. We introduce visual ability into the large language model to complete the ophthalmic large language and vision assistant (OphGLM). Our experimental results demonstrate that the OphGLM model performs exceptionally well, and it has the potential to revolutionize clinical applications in ophthalmology. The dataset, code, and models will be made publicly available at https://github.com/ML-AILab/OphGLM.

View on arXiv PDF Code

Similar