CV | May 6, 2025

Robust Fairness Vision-Language Learning for Medical Image Analysis

arXiv:2505.03153v1 | 3 citations | h-index: 11 | MIPR
AI Analysis

This paper addresses fairness and robustness issues in medical AI that affect patients, but the contribution is incremental because it builds on existing VLM methods.

The paper tackles the problem of ensuring fairness and robustness in Vision-Language Models for medical image analysis by introducing a framework that modifies the training loss function, resulting in up to an 8.6% improvement in equity-scaled AUC.

The advent of Vision-Language Models (VLMs) in medical image analysis has the potential to help process multimodal inputs and increase performance over traditional inference methods. However, given the domain in which these models will be deployed, fairness and robustness are important to ensure the model performs reliably for every patient. In this paper, we introduce a framework for ensuring the robustness and fairness of VLMs. The framework modifies the loss function at training time in two ways: it identifies and adjusts faulty image-text pairs through a Dynamic Bad Pair Mining algorithm, and it uses the Sinkhorn distance to ensure the loss distributions of protected groups do not deviate from the total loss. Experimental testing of our framework shows up to an 8.6% improvement in equity-scaled AUC.
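The two loss modifications described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the abstract does not give the exact formulation, so the function names, the squared-difference cost, the thresholding rule for flagging faulty pairs (historical mean loss above mean + `beta` * std), and the parameters `eps`, `beta`, `down_weight`, and `lam` are all assumptions chosen for illustration.

```python
import numpy as np

def sinkhorn_distance(x, y, eps=0.1, n_iters=200):
    """Entropy-regularized OT cost between two 1-D samples (uniform weights).

    Assumption: squared difference as ground cost. Plain (non-log-domain)
    Sinkhorn iterations, so very small eps may underflow.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    a = np.full(len(x), 1.0 / len(x))    # uniform mass on x
    b = np.full(len(y), 1.0 / len(y))    # uniform mass on y
    C = (x[:, None] - y[None, :]) ** 2   # pairwise cost matrix
    K = np.exp(-C / eps)                 # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):             # alternate marginal scalings
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]      # approximate transport plan
    return float(np.sum(P * C))

def fairness_penalty(losses, groups, eps=0.1):
    """Sum of Sinkhorn distances between each group's losses and all losses."""
    losses = np.asarray(losses, dtype=float)
    groups = np.asarray(groups)
    return sum(sinkhorn_distance(losses[groups == g], losses, eps)
               for g in np.unique(groups))

def bad_pair_weights(loss_history, beta=1.0, down_weight=0.1):
    """Down-weight pairs whose historical mean loss is unusually high.

    loss_history: (n_epochs, n_pairs) per-pair losses from earlier epochs.
    The mean + beta * std threshold is a hypothetical rule for illustration.
    """
    mean_per_pair = np.asarray(loss_history, dtype=float).mean(axis=0)
    threshold = mean_per_pair.mean() + beta * mean_per_pair.std()
    return np.where(mean_per_pair > threshold, down_weight, 1.0)

def total_loss(losses, groups, weights, lam=1.0):
    """Weighted mean loss plus lam times the group fairness penalty."""
    losses = np.asarray(losses, dtype=float)
    return float(np.mean(weights * losses)
                 + lam * fairness_penalty(losses, groups))
```

In this sketch, `bad_pair_weights` would be recomputed each epoch from the stored per-pair losses, which is what makes the mining "dynamic", and `fairness_penalty` shrinks toward zero as each protected group's loss distribution approaches the overall one. A real training loop would need a differentiable version of the same computation in the training framework.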

Code Implementations: 1 repo
