CV · Sep 18, 2025

Calibration-Aware Prompt Learning for Medical Vision-Language Models

Abhishek Basu, Fahad Shamshad, Ashshak Sharifdeen, Karthik Nandakumar, Muhammad Haris Khan

arXiv:2509.15226v18.41 citations · h-index: 14 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses a critical issue for medical AI applications by making predictions more trustworthy and reliable, though it is incremental in that it builds on existing prompt tuning methods.

The paper tackles poor confidence calibration in Medical Vision-Language Models (Med-VLMs), which can lead to overconfident errors in clinical settings. It introduces CalibPrompt, a framework that improves calibration without significantly reducing accuracy across multiple models and datasets.

Medical Vision-Language Models (Med-VLMs) have demonstrated remarkable performance across diverse medical imaging tasks by leveraging large-scale image-text pretraining. However, their confidence calibration remains largely unexplored and is a significant challenge. Miscalibrated predictions can lead to overconfident errors, undermining clinical trust and decision-making reliability. To address this, we introduce CalibPrompt, the first framework to calibrate Med-VLMs during prompt tuning. CalibPrompt optimizes a small set of learnable prompts with carefully designed calibration objectives in a scarce labeled data regime. First, we study a regularizer that attempts to align the smoothed accuracy with the predicted model confidences. Second, we introduce an angular separation loss to maximize textual feature proximity toward improving the reliability of confidence estimates of multimodal Med-VLMs. Extensive experiments on four publicly available Med-VLMs and five diverse medical imaging datasets reveal that CalibPrompt consistently improves calibration without drastically affecting clean accuracy. Our code is available at https://github.com/iabh1shekbasu/CalibPrompt.
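
To make the notion of calibration concrete: a model is well calibrated when its confidence matches its accuracy, and a common way to measure the gap is the Expected Calibration Error (ECE). The abstract does not say which metrics the paper reports, so the following is only a minimal sketch of a standard binned ECE, not the authors' implementation. The function name `expected_calibration_error` and the default of 10 bins are assumptions made for illustration.

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=10):
    """Binned ECE (illustrative sketch, not taken from the CalibPrompt code).

    probs:  (N, C) array of predicted class probabilities.
    labels: (N,) array of integer ground-truth classes.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)

    # Confidence is the top-class probability; a prediction is correct
    # when the top class equals the label.
    confidences = probs.max(axis=1)
    predictions = probs.argmax(axis=1)
    correct = (predictions == labels).astype(float)

    # Equal-width confidence bins over (0, 1].
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            # Gap between accuracy and mean confidence in this bin,
            # weighted by the fraction of samples that fall in it.
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece

# Two predictions made with 0.9 confidence, only one of them correct:
# accuracy is 0.5 while confidence is 0.9, so the model is overconfident.
overconfident = expected_calibration_error(
    [[0.9, 0.1], [0.9, 0.1]], [0, 1]
)
```

A lower value means confidence tracks accuracy more closely. In the example, both samples land in the same bin, so the score is simply the gap between the 0.9 mean confidence and the 0.5 accuracy.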
