CV | AI | LG | Sep 10, 2025

Retrieval-Augmented VLMs for Multimodal Melanoma Diagnosis

arXiv:2509.08338v1 | h-index: 2 | ISIC@MICCAI
Originality: Incremental advance
AI Analysis

This work addresses accurate and early melanoma diagnosis and offers a robust strategy for clinical decision support. It appears incremental, since it builds on existing VLM approaches by adding retrieval augmentation.

The paper targets two limitations of existing melanoma diagnosis methods: they neglect clinical metadata, and general-domain models lack clinical specificity. It proposes a retrieval-augmented VLM framework that inserts similar patient cases into the diagnostic prompt, and it reports significantly improved classification accuracy and error correction over conventional baselines.

Accurate and early diagnosis of malignant melanoma is critical for improving patient outcomes. While convolutional neural networks (CNNs) have shown promise in dermoscopic image analysis, they often neglect clinical metadata and require extensive preprocessing. Vision-language models (VLMs) offer a multimodal alternative but struggle to capture clinical specificity when trained on general-domain data. To address this, we propose a retrieval-augmented VLM framework that incorporates semantically similar patient cases into the diagnostic prompt. Our method enables informed predictions without fine-tuning and significantly improves classification accuracy and error correction over conventional baselines. These results demonstrate that retrieval-augmented prompting provides a robust strategy for clinical decision support.
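The retrieval-augmented prompting idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Case` fields, the cosine-similarity retriever, the prompt wording, and the toy two-dimensional embeddings are all assumptions made for illustration. The sketch only shows the general pattern of ranking stored cases by similarity to a new patient and placing the top matches in the prompt, so that a frozen VLM can use them without fine-tuning.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Case:
    """A stored patient case (hypothetical schema, not from the paper)."""
    embedding: List[float]  # assumed to be precomputed by some encoder
    metadata: str           # free-text clinical metadata
    label: str              # confirmed diagnosis


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: List[float], cases: List[Case], k: int = 2) -> List[Case]:
    """Return the k stored cases most similar to the query embedding."""
    ranked = sorted(cases, key=lambda c: cosine(query, c.embedding), reverse=True)
    return ranked[:k]


def build_prompt(query_metadata: str, neighbours: List[Case]) -> str:
    """Assemble a text prompt that lists retrieved cases before the new patient."""
    lines = ["Similar past cases:"]
    for i, case in enumerate(neighbours, start=1):
        lines.append(f"{i}. {case.metadata} -> diagnosis: {case.label}")
    lines.append(f"New patient: {query_metadata}")
    lines.append("Question: is the attached lesion melanoma or benign?")
    return "\n".join(lines)


# Toy data: embeddings are made up for the example.
cases = [
    Case([1.0, 0.0], "age 62, back, changing lesion", "melanoma"),
    Case([0.0, 1.0], "age 25, arm, stable lesion", "benign"),
    Case([0.9, 0.1], "age 58, shoulder, irregular border", "melanoma"),
]
neighbours = retrieve([1.0, 0.0], cases, k=2)
prompt = build_prompt("age 60, back, recent growth", neighbours)
```

In a real pipeline the resulting prompt would be sent to the VLM together with the dermoscopic image; that call is omitted here because the model and its API are not specified in the source.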

