CV | AI | Oct 21, 2024

MI-VisionShot: Few-shot adaptation of vision-language models for slide-level classification of histopathological images

arXiv:2410.15881v1 | 3 citations | h-index: 9 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses slide-level classification in digital pathology, offering a more stable few-shot adaptation method for histopathological image analysis, though it is incremental as it builds on existing vision-language models and prototypical learning.

The paper tackles the high variability of zero-shot transfer in slide-level classification of histopathological images. It proposes MI-VisionShot, a training-free adaptation method that combines vision-language models with prototypical learning to build prototype-based classifiers, and reports that it surpasses zero-shot transfer with lower variability in few-shot scenarios.

Vision-language supervision has made remarkable strides in learning visual representations from textual guidance. In digital pathology, vision-language models (VLMs), pre-trained on curated datasets of histological image-captions, have been adapted to downstream tasks, such as region of interest classification. Zero-shot transfer for slide-level prediction has been formulated by MI-Zero, but it exhibits high variability depending on the textual prompts. Inspired by prototypical learning, we propose MI-VisionShot, a training-free adaptation method on top of VLMs to predict slide-level labels in few-shot learning scenarios. Our framework takes advantage of the excellent representation learning of VLMs to create prototype-based classifiers under a multiple-instance setting by retrieving the most discriminative patches within each slide. Experimentation through different settings shows the ability of MI-VisionShot to surpass zero-shot transfer with lower variability, even in low-shot scenarios. Code coming soon at https://github.com/cvblab/MIVisionShot.
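
The abstract describes the method only at a high level, so the following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes patch embeddings have already been extracted with a VLM image encoder, that "most discriminative" means the top-k patches by cosine similarity to a query vector, and that each class has at least one labeled support slide. The function names (`pool_top_k`, `build_prototypes`, `predict_slide`) and the choice of class text embeddings as the query for support slides are hypothetical.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def pool_top_k(patches, query, k):
    """Average the k patches most similar to `query` into one unit vector.

    patches: (n_patches, dim) patch embeddings of a single slide.
    query:   (dim,) vector used to rank patches (assumption: cosine ranking).
    """
    patches = l2_normalize(patches)
    sims = patches @ l2_normalize(query)
    k = min(k, len(patches))
    top = np.argsort(sims)[-k:]  # indices of the k highest similarities
    return l2_normalize(patches[top].mean(axis=0))


def build_prototypes(support_slides, support_labels, text_embeds, k=5):
    """Build one visual prototype per class from a few labeled slides.

    Assumption: patches of a support slide are ranked against the text
    embedding of that slide's own class. No parameters are trained.
    """
    prototypes = []
    for c in range(text_embeds.shape[0]):
        pooled = [
            pool_top_k(slide, text_embeds[c], k)
            for slide, label in zip(support_slides, support_labels)
            if label == c
        ]
        prototypes.append(l2_normalize(np.mean(pooled, axis=0)))
    return np.stack(prototypes)


def predict_slide(patches, prototypes, k=5):
    """Score a slide against each prototype and return the best class index."""
    scores = [float(pool_top_k(patches, p, k) @ p) for p in prototypes]
    return int(np.argmax(scores))
```

In this sketch the text prompts influence only which support patches are pooled, while the final decision compares visual embeddings with visual prototypes. That is one way a prototype-based classifier could be less sensitive to prompt wording than pure zero-shot transfer, though the paper itself should be consulted for the actual procedure.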

Code Implementations: 1 repo
