CV | Feb 10

Singpath-VL Technical Report

arXiv:2602.09523v1 | Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses a domain-specific problem in cytopathology and provides a specialized tool for medical professionals, but it is incremental because it builds on existing MLLM methods.

The authors address the lack of AI assistants in cervical cytology by developing Singpath-VL, a vision-language model that they report achieves superior performance in fine-grained morphological perception and cell-level diagnostic classification.

We present Singpath-VL, a large vision-language model, to fill the gap in AI assistants for cervical cytology. Recent advances in multi-modal large language models (MLLMs) have significantly advanced the field of computational pathology. However, their application in cytopathology, particularly cervical cytology, remains underexplored, primarily because large-scale, high-quality annotated datasets are scarce. To bridge this gap, we first develop a novel three-stage pipeline to synthesize a million-scale image-description dataset. The pipeline uses multiple general-purpose MLLMs as weak annotators, refines their outputs through consensus fusion and expert knowledge injection, and produces high-fidelity descriptions of cell morphology. Using this dataset, we then fine-tune the Qwen3-VL-4B model via a multi-stage strategy to create a specialized cytopathology MLLM. The resulting model, named Singpath-VL, demonstrates superior performance in fine-grained morphological perception and cell-level diagnostic classification. To advance the field, we will open-source a portion of the synthetic dataset and benchmark.
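The abstract does not say how consensus fusion or expert knowledge injection is implemented. As a rough illustration only, the three stages could be sketched as below, assuming each weak annotator returns a dictionary of morphology attributes. The function names, the attribute keys, the majority-vote rule, and the example expert rule are all hypothetical and are not taken from the paper; the rule is a placeholder and not clinical guidance.

```python
from collections import Counter


def fuse_by_consensus(annotations, min_agreement=2):
    """Keep, per attribute, the value that at least `min_agreement` annotators share.

    Assumption: majority voting stands in for the paper's unspecified fusion step.
    """
    fused = {}
    attributes = {key for ann in annotations for key in ann}
    for attr in sorted(attributes):
        votes = Counter(ann[attr] for ann in annotations if attr in ann)
        value, count = votes.most_common(1)[0]
        if count >= min_agreement:
            fused[attr] = value
    return fused


def inject_expert_knowledge(fused, expert_rules):
    """Add expert-authored fields when all conditions of a rule match.

    Assumption: rules are (condition_dict, addition_dict) pairs; hypothetical format.
    """
    enriched = dict(fused)
    for condition, addition in expert_rules:
        if all(enriched.get(k) == v for k, v in condition.items()):
            enriched.update(addition)
    return enriched


def render_description(attrs):
    """Flatten the attribute dictionary into a single description string."""
    return "; ".join(f"{k}: {v}" for k, v in sorted(attrs.items()))


# Stage 1 (stand-in): outputs of three hypothetical weak annotators for one cell image.
annotations = [
    {"nuclear_size": "enlarged", "chromatin": "coarse", "nc_ratio": "high"},
    {"nuclear_size": "enlarged", "chromatin": "fine", "nc_ratio": "high"},
    {"nuclear_size": "enlarged", "chromatin": "coarse"},
]

# Illustrative placeholder rule, not clinical guidance.
expert_rules = [
    ({"nuclear_size": "enlarged", "nc_ratio": "high"}, {"flag": "review_for_abnormality"}),
]

fused = fuse_by_consensus(annotations)                    # Stage 2: consensus fusion
enriched = inject_expert_knowledge(fused, expert_rules)   # Stage 3: expert injection
print(render_description(enriched))
```

In this sketch an attribute with no agreeing pair of annotators is dropped rather than guessed, which is one plausible way to trade coverage for description fidelity.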

