CV | Sep 18, 2023

R2GenGPT: Radiology Report Generation with Frozen LLMs

arXiv:2309.09812v2 | 202 citations | h-index: 101 | Has Code
Originality: Incremental advance
AI Analysis

This paper addresses efficient and effective radiology report generation for medical imaging. The contribution is incremental, since it builds on existing LLM and alignment techniques.

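The paper tackles the challenge of adapting large language models (LLMs) to radiology report generation by proposing R2GenGPT, which aligns visual features with the LLM's word embedding space using a lightweight visual alignment module. Training only that module while keeping the LLM frozen reaches state-of-the-art performance, and a delta-tuning variant that trains just 5M parameters (0.07% of the total) performs close to state-of-the-art levels.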

Large Language Models (LLMs) have consistently showcased remarkable generalization capabilities when applied to various language tasks. Nonetheless, harnessing the full potential of LLMs for Radiology Report Generation (R2Gen) still presents a challenge, stemming from the inherent disparity in modality between LLMs and the R2Gen task. To bridge this gap effectively, we propose R2GenGPT, which is a novel solution that aligns visual features with the word embedding space of LLMs using an efficient visual alignment module. This innovative approach empowers the previously static LLM to seamlessly integrate and process image information, marking a step forward in optimizing R2Gen performance. R2GenGPT offers the following benefits. First, it attains state-of-the-art (SOTA) performance by training only the lightweight visual alignment module while freezing all the parameters of the LLM. Second, it exhibits high training efficiency, as it requires training only a very small number of parameters while achieving rapid convergence. By employing delta tuning, our model trains only 5M parameters (which constitute just 0.07% of the total parameter count) to achieve performance close to SOTA levels. Our code is available at https://github.com/wang-zhanyu/R2GenGPT.
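The parameter figures in the abstract can be sanity-checked with a little arithmetic. The sketch below is illustrative only: it derives the total parameter count implied by "5M parameters" and "0.07%", then counts the weights of a single linear projection from visual features into an LLM embedding space. The single-layer design and the dimensions `VISUAL_DIM` and `LLM_EMBED_DIM` are assumptions chosen for illustration, not values stated in the abstract, and the helper names are hypothetical.

```python
def linear_param_count(in_features: int, out_features: int, bias: bool = True) -> int:
    """Count the weights (and optional biases) of one fully connected layer."""
    return in_features * out_features + (out_features if bias else 0)


def trainable_fraction(trainable: int, total: float) -> float:
    """Share of parameters that receive gradient updates."""
    return trainable / total


# Figures quoted in the abstract.
TRAINED_PARAMS = 5_000_000
TRAINED_SHARE = 0.0007  # 0.07%

# Total model size implied by the two figures above.
implied_total = TRAINED_PARAMS / TRAINED_SHARE

# ASSUMPTION: hypothetical feature sizes, not taken from the abstract.
VISUAL_DIM = 1024      # width of the visual encoder output
LLM_EMBED_DIM = 4096   # width of the LLM word embeddings

# ASSUMPTION: the alignment module is modelled here as one linear layer.
projection_params = linear_param_count(VISUAL_DIM, LLM_EMBED_DIM)

print(f"implied total parameters: {implied_total / 1e9:.2f}B")  # 7.14B
print(f"linear projection parameters: {projection_params:,}")   # 4,198,400
print(f"projection share of total: "
      f"{trainable_fraction(projection_params, implied_total):.4%}")
```

Under these assumed dimensions, one projection layer accounts for roughly 4.2M weights, the same order of magnitude as the 5M trained parameters reported, while the frozen LLM contributes nearly all of the implied total of about 7.1B.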

Code Implementations: 1 repo
