CV | Jun 21, 2023

OphGLM: Training an Ophthalmology Large Language-and-Vision Assistant based on Instructions and Dialogue

arXiv:2306.12174v2 | 52 citations | h-index: 10 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses a domain-specific problem in ophthalmology by building a new multimodal model, though the advance is incremental: it adapts existing LMM techniques to a medical niche.

The authors address the lack of multimodal large language models in ophthalmology by developing OphGLM, an assistant that combines vision and language for disease diagnosis and lesion segmentation from fundus images. They report strong experimental performance and suggest the model could benefit clinical applications.

Large multimodal language models (LMMs) have achieved significant success in general domains. However, due to the significant differences between medical images and text and general web content, the performance of LMMs in medical scenarios is limited. In ophthalmology, clinical diagnosis relies on multiple modalities of medical images, but unfortunately, multimodal ophthalmic large language models have not been explored to date. In this paper, we study and construct an ophthalmic large multimodal model. Firstly, we use fundus images as an entry point to build a disease assessment and diagnosis pipeline to achieve common ophthalmic disease diagnosis and lesion segmentation. Then, we establish a new ophthalmic multimodal instruction-following and dialogue fine-tuning dataset based on disease-related knowledge data and publicly available real-world medical dialogue. We introduce visual ability into the large language model to complete the ophthalmic large language and vision assistant (OphGLM). Our experimental results demonstrate that the OphGLM model performs exceptionally well, and it has the potential to revolutionize clinical applications in ophthalmology. The dataset, code, and models will be made publicly available at https://github.com/ML-AILab/OphGLM.

Code Implementations: 1 repo