CV · MM · Nov 2, 2025

Med-Banana-50K: A Cross-modality Large-Scale Dataset for Text-guided Medical Image Editing

arXiv:2511.00801v3 · 2 citations · h-index: 1 · Has Code
Originality: Synthesis-oriented
AI Analysis

This dataset addresses a critical bottleneck for researchers and practitioners in medical AI by providing a foundational resource for developing and evaluating reliable medical image editing systems, though it is incremental as it builds on existing data creation methods.

The authors tackle the lack of large-scale, high-quality datasets for medical image editing by introducing Med-Banana-50K, a dataset of over 50k medically curated image edits across 23 diseases. It also retains around 37,000 failed editing attempts, with their evaluation logs, to support preference learning and alignment research.

Medical image editing has emerged as a pivotal technology with broad applications in data augmentation, model interpretability, medical education, and treatment simulation. However, the lack of large-scale, high-quality, and openly accessible datasets tailored for medical contexts with strict anatomical and clinical constraints has significantly hindered progress in this domain. To bridge this gap, we introduce Med-Banana-50K, a comprehensive dataset of over 50k medically curated image edits spanning chest X-ray, brain MRI, and fundus photography across 23 diseases. Each sample supports bidirectional lesion editing (addition and removal) and is constructed using Gemini-2.5-Flash-Image based on real clinical images. A key differentiator of our dataset is the medically grounded quality control protocol: we employ an LLM-as-Judge evaluation framework with criteria such as instruction compliance, structural plausibility, image realism, and fidelity preservation, alongside iterative refinement over up to five rounds. Additionally, Med-Banana-50K includes around 37,000 failed editing attempts with full evaluation logs to support preference learning and alignment research. By offering a large-scale, medically rigorous, and fully documented resource, Med-Banana-50K establishes a critical foundation for developing and evaluating reliable medical image editing systems. Our dataset and code are publicly available at https://github.com/richardChenzhihui/med-banana-50k.
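
The quality-control loop described in the abstract (edit, judge against the four criteria, refine for up to five rounds, keep the failures) can be illustrated with a minimal Python sketch. This is not the authors' code: the names `refine_edit`, `preference_pairs`, `edit_fn`, and `judge_fn`, the numeric 0-to-1 score scale, the pass threshold of 0.8, and the way feedback is appended to the prompt are all assumptions made for illustration. The actual pipeline calls Gemini-2.5-Flash-Image and an LLM judge, which are replaced here by caller-supplied functions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# The four criteria named in the abstract; the identifiers are illustrative.
CRITERIA = (
    "instruction_compliance",
    "structural_plausibility",
    "image_realism",
    "fidelity_preservation",
)
MAX_ROUNDS = 5  # "up to five rounds" per the abstract


@dataclass
class Attempt:
    round_index: int
    prompt: str
    image: str                 # placeholder for image data or a file path
    scores: Dict[str, float]   # assumed numeric scores in [0, 1]
    passed: bool


@dataclass
class EditResult:
    accepted: Optional[Attempt]
    failed: List[Attempt] = field(default_factory=list)


def refine_edit(
    source_image: str,
    instruction: str,
    edit_fn: Callable[[str, str], str],
    judge_fn: Callable[[str, str, str], Dict[str, float]],
    threshold: float = 0.8,    # hypothetical pass threshold
    max_rounds: int = MAX_ROUNDS,
) -> EditResult:
    """Edit, judge, and retry until every criterion passes or rounds run out."""
    failed: List[Attempt] = []
    prompt = instruction
    for round_index in range(1, max_rounds + 1):
        edited = edit_fn(source_image, prompt)
        scores = judge_fn(source_image, edited, instruction)
        passed = all(scores[c] >= threshold for c in CRITERIA)
        attempt = Attempt(round_index, prompt, edited, scores, passed)
        if passed:
            return EditResult(accepted=attempt, failed=failed)
        # Keep the failed attempt and its scores as an evaluation log entry.
        failed.append(attempt)
        # Assumed feedback strategy: name the weakest criterion in the next prompt.
        weakest = min(CRITERIA, key=lambda c: scores[c])
        prompt = f"{instruction} (previous attempt was weak on {weakest})"
    return EditResult(accepted=None, failed=failed)


def preference_pairs(result: EditResult) -> List[Tuple[str, str]]:
    """Pair the accepted image (chosen) with each failed image (rejected)."""
    if result.accepted is None:
        return []
    return [(result.accepted.image, a.image) for a in result.failed]
```

Under these assumptions, a sample that passes in round three yields one accepted edit and two logged failures, which `preference_pairs` turns into two (chosen, rejected) pairs of the kind preference-learning methods consume. A sample that never passes contributes only failed attempts.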
