CL · Apr 7

MedLayBench-V: A Large-Scale Benchmark for Expert-Lay Semantic Alignment in Medical Vision Language Models

Han Jang, Junhyeok Lee, Heeseong Eum, Kyu Sung Choi

arXiv:2604.05738 · 88.3

Predicted impact: top 39% in CL · last 90 days
Originality: Synthesis-oriented

AI Analysis

This work addresses the communication gap between clinical experts and patients in medical imaging. The contribution is incremental: it creates a benchmark rather than proposing a new model or method.

The paper addresses the inability of medical vision-language models to communicate findings in lay terms for patient-centered care. It introduces MedLayBench-V, a large-scale multimodal benchmark for expert-lay semantic alignment, constructed via a Structured Concept-Grounded Refinement pipeline that enforces semantic equivalence.

Medical Vision-Language Models (Med-VLMs) have achieved expert-level proficiency in interpreting diagnostic imaging. However, current models are predominantly trained on professional literature, limiting their ability to communicate findings in the lay register required for patient-centered care. While text-centric research has actively developed resources for simplifying medical jargon, there is a critical absence of large-scale multimodal benchmarks designed to facilitate lay-accessible medical image understanding. To bridge this resource gap, we introduce MedLayBench-V, the first large-scale multimodal benchmark dedicated to expert-lay semantic alignment. Unlike naive simplification approaches that risk hallucination, our dataset is constructed via a Structured Concept-Grounded Refinement (SCGR) pipeline. This method enforces strict semantic equivalence by integrating Unified Medical Language System (UMLS) Concept Unique Identifiers (CUIs) with micro-level entity constraints. MedLayBench-V provides a verified foundation for training and evaluating next-generation Med-VLMs capable of bridging the communication divide between clinical experts and patients.
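The abstract does not give implementation details for SCGR, so the following is only a minimal sketch of the general idea of concept-grounded equivalence checking: an expert caption and its lay rewrite are mapped to concept identifiers, and the rewrite is accepted only if both texts yield the same concept set. The lexicon, the `CUI-000x` identifiers (placeholders, not real UMLS CUIs), and the function names `extract_concepts` and `concept_alignment` are all assumptions made for illustration; a real pipeline would presumably use a UMLS-backed entity linker instead of phrase matching.

```python
import re

# Toy concept lexicon: surface phrase -> concept ID.
# The IDs are illustrative placeholders, NOT real UMLS CUIs.
TOY_LEXICON = {
    "pulmonary edema": "CUI-0001",
    "fluid in the lungs": "CUI-0001",
    "cardiomegaly": "CUI-0002",
    "enlarged heart": "CUI-0002",
    "pneumothorax": "CUI-0003",
    "collapsed lung": "CUI-0003",
}


def extract_concepts(text, lexicon=TOY_LEXICON):
    """Return the set of concept IDs whose phrases occur in the text."""
    lowered = text.lower()
    found = set()
    for phrase, concept_id in lexicon.items():
        # Whole-phrase match so that substrings of longer words do not count.
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
            found.add(concept_id)
    return found


def concept_alignment(expert, lay, lexicon=TOY_LEXICON):
    """Compare the concept sets of an expert caption and its lay rewrite."""
    expert_concepts = extract_concepts(expert, lexicon)
    lay_concepts = extract_concepts(lay, lexicon)
    return {
        "missing": expert_concepts - lay_concepts,  # dropped by the rewrite
        "added": lay_concepts - expert_concepts,    # possible hallucination
        "equivalent": expert_concepts == lay_concepts,
    }


if __name__ == "__main__":
    expert = "Chest radiograph shows cardiomegaly and pulmonary edema."
    lay = "The chest X-ray shows an enlarged heart and fluid in the lungs."
    print(concept_alignment(expert, lay))
```

In this sketch, a lay rewrite that drops a finding shows up under `missing`, and one that introduces a finding absent from the expert text shows up under `added`; either case would be a reason to reject or regenerate the rewrite.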
