CV | CL | May 5, 2025

AOR: Anatomical Ontology-Guided Reasoning for Medical Large Multimodal Model in Chest X-Ray Interpretation

arXiv:2505.02830v1 | 10 citations | h-index: 8
Originality: Incremental advance
AI Analysis

This work aims to improve the diagnostic accuracy and interpretability of medical imaging models for clinicians. It is an incremental advance that enhances existing models with anatomy-centric reasoning.

The paper addresses two weaknesses of Medical Large Multimodal Models in chest X-ray interpretation: insufficient region-level understanding and limited accuracy. It introduces an Anatomical Ontology-Guided Reasoning (AOR) framework, for which the authors report superior performance on VQA and report generation tasks.

Chest X-rays (CXRs) are the most frequently performed imaging examinations in clinical settings. Recent advancements in Large Multimodal Models (LMMs) have enabled automated CXR interpretation, enhancing diagnostic accuracy and efficiency. However, despite their strong visual understanding, current Medical LMMs (MLMMs) still face two major challenges: (1) Insufficient region-level understanding and interaction, and (2) Limited accuracy and interpretability due to single-step reasoning. In this paper, we empower MLMMs with anatomy-centric reasoning capabilities to enhance their interactivity and explainability. Specifically, we first propose an Anatomical Ontology-Guided Reasoning (AOR) framework, which centers on cross-modal region-level information to facilitate multi-step reasoning. Next, under the guidance of expert physicians, we develop AOR-Instruction, a large instruction dataset for MLMMs training. Our experiments demonstrate AOR's superior performance in both VQA and report generation tasks.

