DSNet: A Dual-Stream Framework for Weakly-Supervised Gigapixel Pathology Image Analysis
This work addresses the challenge of expensive annotations in pathology image analysis for medical applications, though the contribution is incremental because it builds on existing weakly-supervised approaches.
The authors tackled the problem of classifying whole slide images (WSIs) with only image-level labels, which avoids expensive patch-level annotations, by proposing a dual-stream framework that integrates local and regional information. Their method outperformed all recent state-of-the-art weakly-supervised WSI classification methods on two large-scale public datasets.
We present a novel weakly-supervised framework for classifying whole slide images (WSIs). Because of their gigapixel resolution, WSIs are commonly processed by patch-wise classification with patch-level labels. However, patch-level labels require precise annotations, which are expensive and usually unavailable for clinical data. With image-level labels only, patch-wise classification is sub-optimal because the appearance of an individual patch can be inconsistent with the image-level label. To address this issue, we posit that WSI analysis can be effectively conducted by integrating information at both high magnification (local) and low magnification (regional) levels. We auto-encode the visual signals in each patch into a latent embedding vector representing local information, and down-sample the raw WSI to hardware-acceptable thumbnails representing regional information. The WSI label is then predicted with a Dual-Stream Network (DSNet), which takes the transformed local patch embeddings and multi-scale thumbnail images as inputs and can be trained with image-level labels only. Experiments conducted on two large-scale public datasets demonstrate that our method outperforms all recent state-of-the-art weakly-supervised WSI classification methods.
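The input preparation described above can be illustrated with a minimal sketch. It only shows how one slide might be turned into the two inputs (a grid of per-patch embeddings and a set of multi-scale thumbnails); it is not the authors' implementation. Everything named here is a hypothetical stand-in: `toy_encode` replaces the trained auto-encoder with two summary statistics, block averaging is an assumed down-sampling scheme, and the patch size and thumbnail factors are arbitrary toy values on an 8x8 "slide".

```python
from statistics import fmean


def tile(image, size):
    """Split a 2-D image (list of rows) into non-overlapping size x size tiles."""
    h, w = len(image), len(image[0])
    return [
        [[row[c:c + size] for row in image[r:r + size]]
         for c in range(0, w - size + 1, size)]
        for r in range(0, h - size + 1, size)
    ]


def toy_encode(tile_px):
    """Hypothetical placeholder for the auto-encoder's encoder.

    Returns a 2-dimensional "embedding" (mean, max) of the tile's pixels.
    A real encoder would be a trained network producing a latent vector.
    """
    flat = [v for row in tile_px for v in row]
    return [fmean(flat), max(flat)]


def downsample(image, factor):
    """Block-average the image by `factor` (an assumed down-sampling scheme)."""
    return [
        [fmean(v for row in t for v in row) for t in tile_row]
        for tile_row in tile(image, factor)
    ]


def prepare_inputs(image, patch=4, thumb_factors=(2, 4)):
    """Build the two inputs: local patch embeddings and regional thumbnails."""
    local = [[toy_encode(t) for t in tile_row] for tile_row in tile(image, patch)]
    regional = [downsample(image, f) for f in thumb_factors]
    return local, regional


# Toy 8x8 "slide" whose pixel at (r, c) has value r * 8 + c.
slide = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
local, regional = prepare_inputs(slide)
```

Here `local` is a 2x2 grid of embedding vectors and `regional` holds a 4x4 and a 2x2 thumbnail. In a dual-stream design, each would feed its own branch before the image-level prediction; how the two branches are fused is not specified in the text above and is left out of this sketch.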