CV
May 10, 2025

SimMIL: A Universal Weakly Supervised Pre-Training Framework for Multi-Instance Learning in Whole Slide Pathology Images

arXiv:2505.06710v1
h-index: 8
Originality: Incremental advance
AI Analysis

This work addresses a bottleneck in MIL for medical imaging by improving feature extraction, though it is incremental as it builds on existing MIL methods.

The paper tackles the problem of weak instance-level representation learning in multi-instance learning (MIL) for whole-slide pathology images by proposing SimMIL, a weakly supervised pre-training framework that propagates bag-level labels to instances. On large-scale WSI datasets, it achieves better performance than existing pre-training schemes such as ImageNet pre-training and self-supervised learning.

Various multi-instance learning (MIL) based approaches have been developed and successfully applied to whole-slide pathological images (WSIs). Existing MIL methods emphasize the importance of feature aggregators, but largely neglect instance-level representation learning. They assume that a pre-trained feature extractor is available and can be directly used or fine-tuned, which is not always the case. This paper proposes to pre-train the feature extractor for MIL via a weakly-supervised scheme, i.e., propagating the weak bag-level labels to the corresponding instances for supervised learning. To learn effective features for MIL, we further delve into several key components, including strong data augmentation, a non-linear prediction head and a robust loss function. We conduct experiments on common large-scale WSI datasets and find that it achieves better performance than other pre-training schemes (e.g., ImageNet pre-training and self-supervised learning) in different downstream tasks. We further show the compatibility and scalability of the proposed scheme by deploying it in fine-tuning pathology-specific models and pre-training on multiple merged datasets. To our knowledge, this is the first work focusing on representation learning for MIL.
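The label-propagation idea in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the function names, the toy patch identifiers, and in particular the choice of generalized cross-entropy as the "robust loss", since the abstract does not say which loss the authors use. The sketch leaves out the feature extractor, the strong data augmentation and the non-linear prediction head.

```python
from typing import List, Sequence, Tuple


def propagate_bag_labels(
    bags: Sequence[Tuple[Sequence[str], int]]
) -> List[Tuple[str, int]]:
    """Give every instance (patch) the label of the bag (slide) it came from.

    The resulting instance labels are noisy: a positive slide may contain
    many patches that are actually negative.
    """
    return [(patch, label) for patches, label in bags for patch in patches]


def generalized_cross_entropy(
    probs: Sequence[float], label: int, q: float = 0.7
) -> float:
    """Generalized cross-entropy, (1 - p_y^q) / q.

    Used here only as a stand-in for a noise-robust loss (an assumption,
    not necessarily the paper's choice). Unlike plain cross-entropy it is
    bounded above by 1 / q, so confidently mislabeled instances cannot
    dominate the objective.
    """
    return (1.0 - probs[label] ** q) / q


# Two toy slides; the patch identifiers are placeholders for image tiles.
bags = [(["s1_p0", "s1_p1", "s1_p2"], 1), (["s2_p0", "s2_p1"], 0)]
instances = propagate_bag_labels(bags)

# Each (patch, label) pair would then feed an ordinary supervised
# classifier whose backbone becomes the pre-trained feature extractor.
for patch, label in instances:
    print(patch, label)
```

The design point is that propagation turns a bag-level problem into a standard instance-level classification task with noisy labels, which is why a loss that tolerates label noise is a natural companion to it.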

