CV · Sep 18, 2025

MedFact-R1: Towards Factual Medical Reasoning via Pseudo-Label Augmentation

Gengliang Li, Rongyu Chen, Bin Li, Linlin Yang, Guodong Ding

arXiv:2509.15154v1 · 6.21 citations · h-index: 3 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses unreliable reasoning in medical AI, delivering a strong, specific gain rather than a broad paradigm shift.

The paper tackles factual consistency in medical vision-language models by introducing MEDFACT-R1, a two-stage framework that integrates external knowledge grounding with reinforcement learning. It reports up to 22.5% absolute improvement in factual accuracy over previous state-of-the-art methods on medical QA benchmarks.

Ensuring factual consistency and reliable reasoning remains a critical challenge for medical vision-language models. We introduce MEDFACT-R1, a two-stage framework that integrates external knowledge grounding with reinforcement learning to improve factual medical reasoning. The first stage uses pseudo-label supervised fine-tuning (SFT) to incorporate external factual expertise, while the second stage applies Group Relative Policy Optimization (GRPO) with four tailored factual reward signals to encourage self-consistent reasoning. Across three public medical QA benchmarks, MEDFACT-R1 delivers up to 22.5% absolute improvement in factual accuracy over previous state-of-the-art methods. Ablation studies highlight the necessity of the pseudo-label SFT cold start and validate the contribution of each GRPO reward, underscoring the synergy between knowledge grounding and RL-driven reasoning for trustworthy medical AI. Code is released at https://github.com/Garfieldgengliang/MEDFACT-R1.
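
The abstract does not spell out how the second stage scores sampled answers, so the following is only a minimal sketch of the group-relative advantage step that GRPO is generally described as using: several responses are sampled for one question, each receives a scalar reward, and each reward is normalized against the mean and standard deviation of its own group. The function names, the `eps` constant, and the idea of summing four per-response signals with equal weights are assumptions made for illustration; they are not taken from the paper or its released code.

```python
import statistics


def total_reward(signals):
    """Combine per-response reward signals into one scalar.

    Hypothetical: an unweighted sum. The paper's four factual rewards
    and their weighting are not described in the abstract.
    """
    return float(sum(signals))


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group.

    advantage_i = (r_i - mean(r)) / (std(r) + eps)
    eps (an assumed constant) avoids division by zero when every
    response in the group receives the same reward.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std over the group
    return [(r - mean) / (std + eps) for r in rewards]


if __name__ == "__main__":
    # Four sampled responses to one question, each with four
    # illustrative (made-up) reward signals.
    group_signals = [
        [1.0, 1.0, 0.5, 1.0],
        [0.0, 1.0, 0.0, 0.5],
        [1.0, 0.0, 0.5, 0.0],
        [0.0, 0.0, 0.0, 0.0],
    ]
    rewards = [total_reward(s) for s in group_signals]
    for r, a in zip(rewards, group_relative_advantages(rewards)):
        print(f"reward={r:.2f} advantage={a:+.3f}")
```

Responses that score above their group's mean get a positive advantage and those below get a negative one, so no separate learned value model is needed to provide a baseline.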
