LG | AI | CL | CV | Sep 20, 2024

SLaVA-CXR: Small Language and Vision Assistant for Chest X-ray Report Automation

arXiv:2409.13321v1 | 7 citations | h-index: 9 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the need for privacy-compliant, resource-efficient medical AI tools, particularly for hospitals in low-resource settings, though it is incremental in that it applies existing LLM concepts to a specific domain.

The authors tackle automated chest X-ray report generation by developing SLaVA-CXR, an open-source small language and vision assistant that outperforms previous state-of-the-art larger models while running inference 6 times faster.

Inspired by the success of large language models (LLMs), there is growing research interest in developing LLMs for the medical domain to assist clinicians. However, for hospitals, using closed-source commercial LLMs raises privacy issues, and developing open-source public LLMs requires large-scale computational resources, which are usually limited, especially in resource-limited regions and low-income countries. We propose an open-source Small Language and Vision Assistant (SLaVA-CXR) that can be used for Chest X-Ray report automation. To train a small assistant efficiently, we first propose the Re$^3$Training method, which simulates the cognitive development of radiologists and optimizes the model in three stages: Recognition, Reasoning, and Reporting. We then introduce a data synthesis method, RADEX, which generates a high-quality and diverse training corpus while complying with privacy regulations. Extensive experiments show that SLaVA-CXR, built on a 2.7B backbone, not only outperforms previous state-of-the-art larger models but also runs inference 6 times faster.

Code Implementations: 1 repo