Differentiable Zooming for Multiple Instance Learning on Whole-Slide Images
This work addresses computational efficiency and multi-scale representation in digital pathology. The contribution is domain-specific and incremental, but the reported gains are strong.
The paper tackles the high computational demands and limited contextualization of Multiple Instance Learning (MIL) for Whole-Slide Image (WSI) classification. It proposes ZoomMIL, a method that learns multi-level zooming end-to-end. ZoomMIL outperforms state-of-the-art MIL methods on two large datasets while reducing computational demands by up to 40x in FLOPs and processing time.
Multiple Instance Learning (MIL) methods have become increasingly popular for classifying gigapixel Whole-Slide Images (WSIs) in digital pathology. Most MIL methods operate at a single WSI magnification by processing all the tissue patches. Such a formulation induces high computational requirements and constrains the contextualization of the WSI-level representation to a single scale. A few MIL methods extend to multiple scales, but are computationally more demanding. In this paper, inspired by the pathological diagnostic process, we propose ZoomMIL, a method that learns to perform multi-level zooming in an end-to-end manner. ZoomMIL builds WSI representations by aggregating tissue-context information from multiple magnifications. The proposed method outperforms the state-of-the-art MIL methods in WSI classification on two large datasets, while reducing the computational demands, measured in Floating-Point Operations (FLOPs) and processing time, by up to 40x.
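The zooming idea can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes attention-based MIL pooling, a hard top-k selection of low-magnification patches, and a simple sum to combine the two scales; none of these details are given in the abstract. The names attention_pool and zoom_sketch, the weight vectors, and the array shapes are all hypothetical. Hard top-k selection is not differentiable, so the sketch shows only the forward data flow and why fewer high-magnification patches are processed, not the end-to-end learning the paper describes.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(feats, w):
    """Attention-weighted average of patch features.

    feats: [N, D] patch embeddings; w: [D] scoring vector (hypothetical).
    Returns the pooled [D] representation and the [N] attention weights.
    """
    attn = softmax(feats @ w)
    return attn @ feats, attn

def zoom_sketch(low_feats, high_feats, w_low, w_high, k):
    """Two-level zoom, forward pass only (illustrative assumption).

    low_feats:  [N, D] patches at low magnification.
    high_feats: [N, C, D] the C high-magnification children of each low patch.
    Only the children of the k most-attended low patches are processed.
    """
    low_repr, attn = attention_pool(low_feats, w_low)
    top = np.argsort(attn)[-k:]  # hard top-k; NOT differentiable
    selected = high_feats[top].reshape(-1, high_feats.shape[-1])  # [k*C, D]
    high_repr, _ = attention_pool(selected, w_high)
    # Summing the two scales is an assumption; concatenation would also work.
    return low_repr + high_repr, top

rng = np.random.default_rng(0)
low = rng.normal(size=(8, 3))      # 8 low-magnification patches, 3-D features
high = rng.normal(size=(8, 4, 3))  # 4 children per low-magnification patch
w_low, w_high = rng.normal(size=3), rng.normal(size=3)
slide_repr, top = zoom_sketch(low, high, w_low, w_high, k=2)
print(slide_repr.shape, sorted(top.tolist()))
```

In this toy setting only k * C = 8 of the 32 high-magnification patches are pooled, which is the kind of saving that motivates zooming instead of processing every patch at the highest magnification.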