CV, LG | Dec 19, 2025

CheXPO-v2: Preference Optimization for Chest X-ray VLMs with Knowledge Graph Consistency

arXiv:2512.17213v1 | h-index: 7 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses safety risks in clinical applications by making model reasoning more verifiable, but it is an incremental advance because it builds on existing alignment methods.

The paper tackles hallucinations in medical Vision-Language Models by proposing CheXPO-v2, a framework that uses knowledge graph consistency rewards for process supervision, achieving state-of-the-art accuracy on benchmarks like MIMIC-CXR-VQA with only 5k samples.

Medical Vision-Language Models (VLMs) are prone to hallucinations, compromising clinical reliability. While reinforcement learning methods like Group Relative Policy Optimization (GRPO) offer a low-cost alignment solution, their reliance on sparse, outcome-based rewards inadvertently encourages models to "overthink" -- generating verbose, convoluted, and unverifiable Chain-of-Thought reasoning to justify answers. This focus on outcomes obscures factual errors and poses significant safety risks. To address this, we propose CheXPO-v2, a novel alignment framework that shifts from outcome to process supervision. Our core innovation is a Knowledge Graph Consistency Reward mechanism driven by Entity-Relation Matching. By explicitly parsing reasoning steps into structured "Disease, Relation, Anatomy" triplets, we provide fine-grained supervision that penalizes incoherent logic and hallucinations at the atomic level. Integrating this with a hard-example mining strategy, our approach significantly outperforms GRPO and state-of-the-art models on benchmarks like MIMIC-CXR-VQA. Crucially, CheXPO-v2 achieves new state-of-the-art accuracy using only 5k samples, demonstrating exceptional data efficiency while producing clinically sound and verifiable reasoning. The project source code is publicly available at: https://github.com/ecoxial2007/CheX-Phi4MM.
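The abstract does not give the exact reward formula, so the following is only a minimal sketch of how a consistency reward based on entity-relation matching could be computed and then fed into GRPO-style group-relative normalization. It assumes the reasoning text has already been parsed into (disease, relation, anatomy) triplets and that a reference set of triplets is available. The function names `kg_consistency_reward` and `group_relative_advantages`, the use of F1 as the matching score, and the example triplets are all illustrative assumptions, not the authors' implementation.

```python
from statistics import mean, pstdev

# A triplet is (disease, relation, anatomy); this layout follows the abstract,
# but the concrete string values used with it are hypothetical.
Triplet = tuple[str, str, str]


def normalize(triplet: Triplet) -> Triplet:
    """Lower-case and strip each field so matching ignores trivial differences."""
    return tuple(part.strip().lower() for part in triplet)


def kg_consistency_reward(predicted: list[Triplet], reference: list[Triplet]) -> float:
    """Assumed reward: F1 overlap between predicted and reference triplets.

    Unsupported (hallucinated) triplets lower precision; missing findings
    lower recall. The paper may use a different matching score.
    """
    pred = {normalize(t) for t in predicted}
    ref = {normalize(t) for t in reference}
    matched = len(pred & ref)
    if matched == 0:
        return 0.0
    precision = matched / len(pred)
    recall = matched / len(ref)
    return 2 * precision * recall / (precision + recall)


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """GRPO-style normalization: center and scale rewards within one sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]
```

In this sketch, a response whose reasoning contains one correct triplet and one unsupported triplet, scored against two reference triplets, would receive a reward of 0.5. Responses in the same group with better-grounded reasoning would then receive a higher relative advantage.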

