Random Direct Preference Optimization for Radiography Report Generation
This work addresses the workload of radiologists by improving report generation accuracy, though the contribution appears incremental: it supplements existing models rather than proposing a new paradigm.
The paper aims to raise the quality of radiography report generation toward the level required for clinical deployment. It introduces a model-agnostic framework that applies Direct Preference Optimization with random contrastive sampling, and it reports improvements of up to 5% on clinical performance metrics without additional training data.
Radiography Report Generation (RRG) has gained significant attention in medical image analysis as a promising tool for alleviating the growing workload of radiologists. However, despite numerous advancements, existing methods have yet to achieve the quality required for deployment in real-world clinical settings. Meanwhile, large Visual Language Models (VLMs) have demonstrated remarkable progress in the general domain by adopting training strategies originally designed for Large Language Models (LLMs), such as alignment techniques. In this paper, we introduce a model-agnostic framework to enhance RRG accuracy using Direct Preference Optimization (DPO). Our approach leverages random contrastive sampling to construct training pairs, eliminating the need for reward models or human preference annotations. Experiments in which three state-of-the-art models are supplemented with our Random DPO show that the method improves clinical performance metrics by up to 5%, without requiring any additional training data.
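The following is a minimal sketch of how random contrastive pairs and the standard DPO objective could fit together; it is not the authors' implementation. It assumes that the "chosen" response for a study is its reference report and that the "rejected" response is a report drawn uniformly at random from a different study, which is one plausible reading of "random contrastive sampling". The function names `build_random_pairs` and `dpo_loss`, the default `beta=0.1`, and the scalar log-probability inputs are all illustrative assumptions.

```python
import math
import random


def build_random_pairs(reports, seed=0):
    """Pair each reference report (chosen) with a random other report (rejected).

    Assumption: the rejected sample is drawn uniformly from the reports of
    other studies. Requires at least two reports.
    """
    if len(reports) < 2:
        raise ValueError("need at least two reports to form contrastive pairs")
    rng = random.Random(seed)
    pairs = []
    for i, chosen in enumerate(reports):
        # Sample an index different from i so a report is never its own negative.
        j = rng.choice([k for k in range(len(reports)) if k != i])
        pairs.append({"index": i, "chosen": chosen, "rejected": reports[j]})
    return pairs


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one pair: -log(sigmoid(beta * margin)).

    Inputs are sequence log-probabilities under the trainable policy and the
    frozen reference model (hypothetical scalars here; in practice they are
    summed token log-probabilities conditioned on the image).
    """
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(m)) = log(1 + exp(-m)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    # Toy reports, invented purely for illustration.
    reports = [
        "No acute cardiopulmonary abnormality.",
        "Right lower lobe consolidation.",
        "Small left pleural effusion.",
    ]
    for pair in build_random_pairs(reports):
        print(pair["chosen"], "|", pair["rejected"])
    print(dpo_loss(-1.0, -5.0, -3.0, -3.0))
```

Because the negatives come from the existing training set, this construction needs no reward model, no human preference labels, and no extra data, which matches the claim in the abstract. If two studies share identical report text, the sampled negative can coincide with the chosen report; a real pipeline would likely filter such pairs.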