CV · Sep 30, 2025

ProbMed: A Probabilistic Framework for Medical Multimodal Binding

arXiv:2509.25711v1
h-index: 3
Originality: Highly original
AI Analysis

The paper addresses the need to better integrate diverse medical modalities, such as imaging and clinical text, for diagnostic decision-making. It proposes a novel method for a known bottleneck.

The paper tackles the many-to-many mapping problem in medical multimodal data by introducing ProbMED, a probabilistic framework that improves cross-modality retrieval and classification and outperforms current Med-VLPMs across 13 medical datasets.

Medical decision-making requires integrating diverse medical information, from imaging to clinical narratives. These medical modalities are often acquired in a many-to-many manner. However, current medical vision-language pretraining models (Med-VLPMs) fail to directly account for this many-to-many mapping in their model training and embeddings. To address this, we present Probabilistic Modality-Enhanced Diagnosis (ProbMED), a multimodal Med-VLPM that employs probabilistic contrastive learning to model distributions over embeddings rather than deterministic estimates. ProbMED aligns four distinct modalities -- chest X-rays, electrocardiograms, echocardiograms, and clinical text -- into a unified probabilistic embedding space. We use InfoNCE loss with Hellinger distance to integrate inter-modality distributions. We introduce a probabilistic synthetic sampling loss that captures modality-specific mean and variance to improve intra-modality binding. Extensive experiments across 13 medical datasets demonstrate that our model outperforms current Med-VLPMs in cross-modality retrieval, zero-shot, and few-shot classification. We also demonstrate the robust integration of multiple modalities for prognostication, showing improved intra- and inter-medical modality binding.
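The abstract says ProbMED uses InfoNCE loss with Hellinger distance over distributions rather than point embeddings, but gives no formulas. The sketch below is one plausible reading, not the authors' implementation. It assumes each embedding is a diagonal Gaussian (a mean vector and a variance vector), uses the closed-form Hellinger distance between two such Gaussians, and feeds the negative distance divided by a temperature into a standard InfoNCE objective. The function names, the temperature value, and the toy numbers are all hypothetical.

```python
import math

def hellinger_sq(mu_p, var_p, mu_q, var_q):
    """Squared Hellinger distance between two diagonal Gaussians (closed form)."""
    log_bc = 0.0  # log of the Bhattacharyya coefficient, summed over dimensions
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        s = vp + vq
        log_bc += 0.5 * math.log(2.0 * math.sqrt(vp * vq) / s)
        log_bc -= 0.25 * (mp - mq) ** 2 / s
    return 1.0 - math.exp(log_bc)

def info_nce(dist, temperature=0.1):
    """InfoNCE over a square distance matrix; dist[i][i] is the matched pair.

    Assumption: similarity is taken as -distance / temperature.
    """
    n = len(dist)
    total = 0.0
    for i in range(n):
        logits = [-dist[i][j] / temperature for j in range(n)]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_z - logits[i]
    return total / n

def pairwise_hellinger(a, b):
    """Hellinger distance matrix between two lists of (mean, variance) pairs."""
    return [[math.sqrt(max(0.0, hellinger_sq(ma, va, mb, vb)))
             for (mb, vb) in b] for (ma, va) in a]

# Toy example (made-up numbers): two image embeddings and two text embeddings.
images = [([0.0, 0.0], [1.0, 1.0]), ([3.0, 3.0], [1.0, 1.0])]
texts = [([0.1, -0.1], [1.0, 1.0]), ([2.9, 3.1], [1.0, 1.0])]

loss_matched = info_nce(pairwise_hellinger(images, texts))
loss_swapped = info_nce(pairwise_hellinger(images, texts[::-1]))
```

In this toy setup `loss_matched` comes out lower than `loss_swapped`, because each image distribution lies closest to its own text distribution. A symmetric variant would average the loss over both retrieval directions; whether the paper does so is not stated in the abstract.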

