CV · Apr 2, 2025

Prompting Medical Vision-Language Models to Mitigate Diagnosis Bias by Generating Realistic Dermoscopic Images

arXiv:2504.01838v1 · 2 citations · h-index: 11 · Has Code · ISBI
Originality: Incremental advance
AI Analysis

This paper addresses bias in medical AI for skin disease diagnosis, particularly for underrepresented patient subgroups, though the advance is incremental because it builds on existing generative methods.

The paper tackles bias in AI skin disease diagnosis by proposing a generative framework, DermDiT, which uses vision-language models to generate realistic dermoscopic images, improving representation of underrepresented groups in imbalanced datasets.

Artificial intelligence (AI) for skin disease diagnosis has improved significantly, but these models frequently show biased performance across subgroups, especially with respect to sensitive attributes such as skin color. To address this, we propose a novel generative AI-based framework, the Dermatology Diffusion Transformer (DermDiT), which leverages text prompts generated by vision-language models and multimodal text-image learning to generate new dermoscopic images. We use large vision-language models to generate an accurate prompt for each dermoscopic image; these prompts help generate synthetic images that improve the representation of underrepresented groups (patient, disease, etc.) in highly imbalanced datasets for clinical diagnosis. Our extensive experiments show that large vision-language models provide much more insightful representations, which enable DermDiT to generate high-quality images. Our code is available at https://github.com/Munia03/DermDiT
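The abstract describes using synthetic images to improve the representation of underrepresented groups in imbalanced datasets. As a rough illustration of that idea only, the sketch below counts how many synthetic images each subgroup would need to match the largest subgroup. The function name `synthetic_quota`, the toy labels, and the "match the largest group" target are all assumptions made for this example; they are not taken from the paper or the DermDiT repository.

```python
from collections import Counter


def synthetic_quota(labels, target=None):
    """Return how many synthetic samples each subgroup needs to reach `target`.

    `labels` holds one subgroup label per real image. `target` defaults to
    the size of the largest subgroup (an assumption for illustration, not
    the paper's stated balancing rule).
    """
    counts = Counter(labels)
    if not counts:
        return {}
    if target is None:
        target = max(counts.values())
    # Groups already at or above the target need no synthetic images.
    return {group: max(0, target - n) for group, n in counts.items()}


# Hypothetical (skin tone, diagnosis) labels for a toy imbalanced dataset.
labels = (
    [("light", "nevus")] * 6
    + [("light", "melanoma")] * 3
    + [("dark", "nevus")] * 2
    + [("dark", "melanoma")] * 1
)

quota = synthetic_quota(labels)
for group, n in sorted(quota.items()):
    print(group, n)
```

In a pipeline like the one the abstract outlines, such per-group counts could decide how many text prompts to pass to the image generator for each subgroup; how the authors actually choose these numbers is not stated here.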

Code Implementations: 1 repo