ChatCAD: Interactive Computer-Aided Diagnosis on Medical Image using Large Language Models
This work addresses the problem of making medical CAD systems more understandable to patients by combining LLMs' medical knowledge with existing vision models, though the contribution appears incremental because it builds on established CAD networks.
The paper tackles the challenge of integrating large language models (LLMs) into medical-image computer-aided diagnosis (CAD) networks, using the LLM to summarize network outputs such as diagnoses and lesion segmentations in natural language so that they are easier for users to understand.
Large language models (LLMs) have recently demonstrated their potential in clinical applications, providing valuable medical knowledge and advice. For example, a large dialog LLM like ChatGPT has passed part of the US medical licensing exam. However, LLMs currently have difficulty processing images, which makes it hard for them to interpret medical images, a rich source of information for clinical decisions. Computer-aided diagnosis (CAD) networks for medical images, on the other hand, have seen significant success in the medical field by using advanced deep-learning algorithms to support clinical decision-making. This paper presents a method for integrating LLMs into medical-image CAD networks. The proposed framework uses LLMs to enhance the output of multiple CAD networks, such as diagnosis networks, lesion segmentation networks, and report generation networks, by expressing that output as natural language text and then summarizing and reorganizing it. The goal is to merge the strengths of LLMs' medical domain knowledge and logical reasoning with the visual understanding capability of existing medical-image CAD models, creating a system that is more user-friendly and understandable for patients than conventional CAD systems. In the future, LLMs' medical knowledge could also be used to improve the performance of vision-based medical-image CAD models.
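The pipeline described above (CAD network outputs converted to text, then summarized by an LLM) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the paper's implementation: the `CADOutputs` container, the probability thresholds and their wording, the prompt text, and the `llm` callable, which stands in for whatever chat model would be used.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class CADOutputs:
    """Hypothetical container for the outputs of several CAD networks."""
    disease_probs: Dict[str, float]               # from a diagnosis network
    lesion_area_fraction: Optional[float] = None  # from a segmentation network
    draft_report: Optional[str] = None            # from a report generation network


def describe_probability(disease: str, prob: float) -> str:
    """Map a numeric score to a plain-language sentence.

    The thresholds and phrases are illustrative assumptions.
    """
    if prob < 0.2:
        level = "no sign of"
    elif prob < 0.5:
        level = "a small possibility of"
    elif prob < 0.9:
        level = "a likely presence of"
    else:
        level = "a definite presence of"
    return f"The diagnosis network finds {level} {disease}."


def build_prompt(outputs: CADOutputs) -> str:
    """Express every available network output as text and append an instruction."""
    lines = [
        describe_probability(disease, prob)
        for disease, prob in sorted(outputs.disease_probs.items())
    ]
    if outputs.lesion_area_fraction is not None:
        lines.append(
            "The segmentation network marks about "
            f"{outputs.lesion_area_fraction:.0%} of the image as lesion."
        )
    if outputs.draft_report:
        lines.append(f"The report generation network wrote: {outputs.draft_report}")
    lines.append(
        "Summarize and reorganize these findings into a short report "
        "that a patient can understand."
    )
    return "\n".join(lines)


def summarize(outputs: CADOutputs, llm: Callable[[str], str]) -> str:
    """Send the text prompt to an LLM; `llm` is any function from prompt to reply."""
    return llm(build_prompt(outputs))


if __name__ == "__main__":
    example = CADOutputs(
        disease_probs={"pneumonia": 0.95, "edema": 0.10},
        lesion_area_fraction=0.12,
        draft_report="Patchy opacity in the right lower lobe.",
    )
    # A stub that echoes the prompt, so the sketch runs without any model.
    print(summarize(example, llm=lambda prompt: prompt))
```

Passing the model as a plain callable keeps the sketch independent of any particular LLM API: the vision networks and the language model only meet through the text prompt, which is the integration point the abstract describes.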