CV | AI | Jan 30

A Geometric Multimodal Foundation Model Integrating Bp-MRI and Clinical Reports in Prostate Cancer Classification

arXiv:2602.00214v1 | h-index: 2
Originality: Incremental advance
AI Analysis

This work targets prostate cancer diagnosis, aiming to improve classification accuracy for clinicians by combining imaging with clinical-report data. The advance is incremental: it builds on existing foundation models and Riemannian methods.

The paper tackles subjective, data-scarce prostate cancer classification by proposing MFM-Geom, a geometric multimodal foundation model that integrates bp-MRI and clinical reports. It achieves an AUC-PR of 90.67 with only 10% of the training data and 90.6 on an external dataset.

Prostate cancer (PCa) is one of the most common cancers in men worldwide. Bi-parametric MRI (bp-MRI) and clinical variables are crucial for identifying PCa and improving treatment decisions. However, this process is subject to expert interpretation. Furthermore, most existing computer-aided diagnosis methods focus on imaging-based models; they overlook the clinical context and suffer from data scarcity, which limits their ability to learn robust representations. We propose a geometric multimodal Foundation Model (FM), named MFM-Geom, that learns representations from bp-MRI and clinical reports, encoding both visual findings and the context provided by clinical variables. In the classification head, the approach leverages symmetric positive definite (SPD) matrices and Riemannian deep learning to integrate the imaging-text representations produced by a biomedical multimodal FM. Using 10% of the training data, MFM-Geom outperformed a baseline that classifies from the class-token embedding (+8.3%, AUC-PR of 90.67). Generalization to an external dataset confirmed the robustness of fine-tuning a biomedical FM, achieving an AUC-PR of 90.6.
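To make the SPD idea in the abstract more concrete, here is a minimal sketch of one common way to turn a set of token embeddings into an SPD matrix and then into a flat feature vector. It is not the paper's implementation. The abstract does not say how MFM-Geom builds its SPD matrices, so the regularized covariance, the concatenation of image and text tokens, the log-Euclidean mapping, and all names (`spd_from_tokens`, `log_euclidean_vector`, the random "tokens") are assumptions made for illustration.

```python
import numpy as np


def spd_from_tokens(tokens: np.ndarray, eps: float = 1e-3) -> np.ndarray:
    """Regularized covariance of token embeddings, shape (n_tokens, dim).

    Assumption: covariance pooling stands in for whatever SPD construction
    the paper actually uses. Adding eps * I keeps the matrix strictly
    positive definite even when n_tokens < dim.
    """
    centered = tokens - tokens.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / max(tokens.shape[0] - 1, 1)
    return cov + eps * np.eye(tokens.shape[1])


def log_euclidean_vector(spd: np.ndarray) -> np.ndarray:
    """Map an SPD matrix to a flat vector via the matrix logarithm.

    The matrix log is computed from the eigendecomposition. Off-diagonal
    entries are scaled by sqrt(2) so that the Euclidean norm of the vector
    equals the Frobenius norm of log(spd).
    """
    eigvals, eigvecs = np.linalg.eigh(spd)
    log_spd = (eigvecs * np.log(eigvals)) @ eigvecs.T
    rows, cols = np.triu_indices(spd.shape[0])
    scale = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return log_spd[rows, cols] * scale


# Hypothetical stand-ins for image and text token embeddings from a
# multimodal foundation model (random data, illustration only).
rng = np.random.default_rng(0)
image_tokens = rng.normal(size=(16, 4))
text_tokens = rng.normal(size=(8, 4))

# Assumption: both modalities share an embedding space, so their tokens
# can be pooled into a single second-order (covariance) descriptor.
joint_tokens = np.concatenate([image_tokens, text_tokens], axis=0)
spd = spd_from_tokens(joint_tokens)
features = log_euclidean_vector(spd)
```

The resulting `features` vector has `dim * (dim + 1) / 2` entries and could be fed to an ordinary linear classifier. Riemannian deep learning layers, as named in the abstract, would instead operate on the SPD matrix directly before a mapping of this kind.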

