CV, CL | Nov 30, 2025

Generalized Medical Phrase Grounding

arXiv:2512.01085v1 | h-index: 1
Originality: Incremental advance
AI Analysis

This work addresses the difficulty non-experts have in interpreting radiological reports by improving grounding accuracy on realistic report text. The advance is incremental because it builds on existing grounding paradigms.

The paper reformulates medical phrase grounding to handle multi-region findings and non-groundable phrases. It introduces MedGrounder, which outperforms baselines on PadChest-GR and MS-CXR for these cases, shows strong zero-shot transfer, and needs fewer human box annotations.

Medical phrase grounding (MPG) maps textual descriptions of radiological findings to corresponding image regions. These grounded reports are easier to interpret, especially for non-experts. Existing MPG systems mostly follow the referring expression comprehension (REC) paradigm and return exactly one bounding box per phrase. Real reports often violate this assumption. They contain multi-region findings, non-diagnostic text, and non-groundable phrases, such as negations or descriptions of normal anatomy. Motivated by this, we reformulate the task as generalised medical phrase grounding (GMPG), where each sentence is mapped to zero, one, or multiple scored regions. To realise this formulation, we introduce the first GMPG model: MedGrounder. We adopt a two-stage training regime: pre-training on report sentence–anatomy box alignment datasets and fine-tuning on report sentence–human-annotated box datasets. Experiments on PadChest-GR and MS-CXR show that MedGrounder achieves strong zero-shot transfer and outperforms REC-style and grounded report generation baselines on multi-region and non-groundable phrases, while using far fewer human box annotations. Finally, we show that MedGrounder can be composed with existing report generators to produce grounded reports without retraining the generator.
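
The difference between the REC output contract (exactly one box per phrase) and the GMPG contract (zero, one, or multiple scored regions per sentence) can be illustrated with a minimal Python sketch. Everything here is hypothetical and for illustration only: the `ScoredRegion` type, the function names, and in particular the fixed score threshold are assumptions, not MedGrounder's actual interface or selection rule, which the abstract does not describe.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max), normalised to [0, 1].
Box = Tuple[float, float, float, float]


@dataclass
class ScoredRegion:
    box: Box
    score: float  # assumed confidence that the box grounds the sentence


def rec_style(candidates: List[ScoredRegion]) -> ScoredRegion:
    """REC contract: always return exactly one box, the top-scoring candidate."""
    return max(candidates, key=lambda r: r.score)


def gmpg_style(candidates: List[ScoredRegion],
               threshold: float = 0.5) -> List[ScoredRegion]:
    """GMPG contract: return zero, one, or many scored regions.

    Thresholding is an assumption made for this sketch; it keeps every
    candidate scoring at or above `threshold`, highest score first.
    """
    kept = [r for r in candidates if r.score >= threshold]
    return sorted(kept, key=lambda r: r.score, reverse=True)
```

Under these assumptions, a bilateral finding with two high-scoring candidates yields two regions from `gmpg_style`, and a negated sentence whose candidates all score low yields an empty list. `rec_style` would return a single box in both cases, including for the negated sentence, which is the failure mode the abstract describes.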
