CV, Jan 7

HemBLIP: A Vision-Language Model for Interpretable Leukemia Cell Morphology Analysis

arXiv:2601.03915v1  1.5  h-index: 10

Originality: Incremental advance

AI Analysis

This work addresses the need for transparent and scalable hematological diagnostics for clinicians, though it is incremental as it adapts existing vision-language models to a specific medical domain.

The researchers address the black-box nature of deep learning models for leukemia diagnosis by developing HemBLIP, a vision-language model that generates interpretable descriptions of blood cell morphology. HemBLIP achieves higher caption quality and morphological accuracy than the biomedical foundation model MedGEMMA, and LoRA adaptation provides further gains at reduced computational cost.

Microscopic evaluation of white blood cell morphology is central to leukemia diagnosis, yet current deep learning models often act as black boxes, limiting clinical trust and adoption. We introduce HemBLIP, a vision-language model designed to generate interpretable, morphology-aware descriptions of peripheral blood cells. Using a newly constructed dataset of 14k healthy and leukemic cells paired with expert-derived attribute captions, we adapt a general-purpose VLM via both full fine-tuning and LoRA-based parameter-efficient training, and benchmark against the biomedical foundation model MedGEMMA. HemBLIP achieves higher caption quality and morphological accuracy, while LoRA adaptation provides further gains with significantly reduced computational cost. These results highlight the promise of vision-language models for transparent and scalable hematological diagnostics.
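
To make the computational-cost claim about LoRA concrete, the following is a minimal sketch of the general LoRA idea (a frozen weight matrix plus a trainable low-rank update), not the authors' implementation. The layer size (768 x 768), the rank (8), the scaling factor `alpha`, and the helper names `lora_param_counts`, `matvec`, and `lora_forward` are all illustrative assumptions; none of them come from the paper.

```python
# Illustrative LoRA sketch in pure Python. All sizes and names are
# hypothetical; the paper's actual configuration is not given in the abstract.

def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable parameter counts for one weight matrix.

    Full fine-tuning trains every entry of W (d_out x d_in).
    LoRA trains only A (rank x d_in) and B (d_out x rank).
    """
    full = d_out * d_in
    lora = rank * (d_in + d_out)
    return full, lora


def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


def lora_forward(W, A, B, x, alpha):
    """Compute y = W x + (alpha / r) * B (A x).

    W is frozen; only A and B would receive gradient updates in training.
    """
    r = len(A)                        # rank = number of rows in A
    base = matvec(W, x)               # frozen pretrained path
    update = matvec(B, matvec(A, x))  # low-rank trainable path
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]


if __name__ == "__main__":
    full, lora = lora_param_counts(768, 768, 8)
    print(f"full fine-tuning: {full} trainable parameters")
    print(f"LoRA (rank 8):    {lora} trainable parameters")
```

Under these assumed sizes, a single 768 x 768 layer has 589,824 trainable weights under full fine-tuning but only 12,288 under rank-8 LoRA, roughly 2% of the original. When `B` is initialized to zeros, as is common practice for LoRA, the adapted layer initially reproduces the pretrained output exactly.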

