CV | AI | Jul 24, 2025

PTCMIL: Multiple Instance Learning via Prompt Token Clustering for Whole Slide Image Analysis

arXiv:2507.18848v1 | 1 citation | h-index: 8 | Has Code | MICCAI
Originality: Highly original
AI Analysis

This work addresses the complexity and heterogeneity of whole slide images (WSIs), a problem faced by medical imaging researchers. It represents an incremental improvement over existing multiple instance learning (MIL) methods.

The authors address the challenge of aggregating diverse patch information in whole slide image (WSI) analysis by proposing PTCMIL, a method that integrates prompt token clustering into a Vision Transformer (ViT). They report that it outperforms state-of-the-art methods on classification and survival analysis tasks across eight datasets.

Multiple Instance Learning (MIL) has advanced WSI analysis but struggles with the complexity and heterogeneity of WSIs. Existing MIL methods face challenges in aggregating diverse patch information into robust WSI representations. While ViTs and clustering-based approaches show promise, they are computationally intensive and fail to capture task-specific and slide-specific variability. To address these limitations, we propose PTCMIL, a novel Prompt Token Clustering-based ViT for MIL aggregation. By introducing learnable prompt tokens into the ViT backbone, PTCMIL unifies clustering and prediction tasks in an end-to-end manner. It dynamically aligns clustering with downstream tasks, using projection-based clustering tailored to each WSI, reducing complexity while preserving patch heterogeneity. Through token merging and prototype-based pooling, PTCMIL efficiently captures task-relevant patterns. Extensive experiments on eight datasets demonstrate its superior performance in classification and survival analysis tasks, outperforming state-of-the-art methods. Systematic ablation studies confirm its robustness and strong interpretability. The code is released at https://github.com/ubc-tea/PTCMIL.
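The abstract names three steps (projection-based clustering against prompt tokens, token merging, and prototype-based pooling) without giving their exact form. The following is a minimal NumPy sketch of one plausible reading, not the authors' implementation; for that, see the released repository. Everything specific here is an assumption made for illustration: the function names, the dot-product projection score, hard argmax assignment, mean merging, mean pooling, and the fallback for empty clusters. The prompt tokens are random here, whereas in PTCMIL they are learned end to end inside a ViT.

```python
import numpy as np

def cluster_by_projection(patches, prompts):
    """Assign each patch to the prompt token it projects onto most strongly.

    Assumption: the projection score is a plain dot product and the
    assignment is a hard argmax; the paper may use a different rule.
    """
    scores = patches @ prompts.T       # (N, K) projection scores
    return scores.argmax(axis=1)       # (N,) cluster index per patch

def merge_tokens(patches, prompts, assign):
    """Merge the patches of each cluster into one prototype token (mean)."""
    prototypes = prompts.astype(float).copy()
    for c in range(prompts.shape[0]):
        members = patches[assign == c]
        if len(members) > 0:           # empty clusters keep their prompt (assumption)
            prototypes[c] = members.mean(axis=0)
    return prototypes                  # (K, D): N patch tokens reduced to K

def pool_prototypes(prototypes):
    """Pool the K prototypes into one slide-level vector (mean, an assumption)."""
    return prototypes.mean(axis=0)

# Hypothetical slide: 1000 patch embeddings of dimension 64, 5 prompt tokens.
rng = np.random.default_rng(0)
patches = rng.normal(size=(1000, 64))
prompts = rng.normal(size=(5, 64))

assign = cluster_by_projection(patches, prompts)
prototypes = merge_tokens(patches, prompts, assign)
slide_embedding = pool_prototypes(prototypes)
```

Because the assignment is computed from each slide's own patches, the clusters differ from slide to slide, which is one way to read "clustering tailored to each WSI". Merging reduces the token count from the number of patches to the number of prompt tokens, which is where the complexity reduction would come from in this reading.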

Code Implementations: 1 repo
