CV | AI | CL | Jan 31, 2025

DermaSynth: Rich Synthetic Image-Text Pairs Using Open Access Dermatology Datasets

arXiv:2502.00196v2 | h-index: 22 | Has Code
Originality: Synthesis-oriented
AI Analysis

This addresses a data bottleneck for AI researchers in dermatology, though it is incremental as it builds on existing open-access datasets.

The authors tackled the lack of large image-text datasets for dermatology vision LLMs by creating DermaSynth, a synthetic dataset of 92,020 pairs from 45,205 images, and fine-tuned a preliminary model, DermatoLlama 1.0, on 5,000 samples.

A major barrier to developing vision large language models (LLMs) in dermatology is the lack of large image-text pair datasets. We introduce DermaSynth, a dataset comprising 92,020 synthetic image-text pairs curated from 45,205 images (13,568 clinical and 35,561 dermatoscopic) for dermatology-related clinical tasks. Using a state-of-the-art LLM, Gemini 2.0, we applied clinically related prompts and the self-instruct method to generate diverse and rich synthetic texts. Dataset metadata was incorporated into the input prompts to reduce potential hallucinations. The resulting dataset builds upon open access dermatological image repositories (DERM12345, BCN20000, PAD-UFES-20, SCIN, and HIBA) that have permissive CC-BY-4.0 licenses. We also fine-tuned a preliminary Llama-3.2-11B-Vision-Instruct model, DermatoLlama 1.0, on 5,000 samples. We anticipate that this dataset will support and accelerate AI research in dermatology. Data and code underlying this work are accessible at https://github.com/abdurrahimyilmaz/DermaSynth.
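
As a rough illustration of what incorporating metadata into the input prompt could look like, the sketch below prepends known image metadata to a clinical question before it would be sent to a vision LLM. It is not the authors' code: the function name `build_prompt`, the metadata field names, and the prompt wording are all hypothetical, and no model call is made.

```python
def build_prompt(question: str, metadata: dict) -> str:
    """Prepend known image metadata to a clinical question (hypothetical helper)."""
    # Keep only fields that have a value, so the prompt never asserts unknowns.
    known = {k: v for k, v in metadata.items() if v not in (None, "")}
    lines = [f"- {key}: {value}" for key, value in sorted(known.items())]
    context = "\n".join(lines) if lines else "- (no metadata available)"
    return (
        "Known metadata for this image:\n"
        f"{context}\n\n"
        "Answer using only the image and the metadata above.\n"
        f"Question: {question}"
    )


# Hypothetical field names; real datasets use their own metadata schemas.
example = build_prompt(
    "Describe the lesion.",
    {"modality": "dermatoscopic", "body_site": "back", "age": None},
)
print(example)
```

Dropping empty fields rather than writing "unknown" is one possible design choice: the model is only ever shown facts that the source dataset actually records.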

