CV | IV | May 16, 2019

Dealing with Label Scarcity in Computational Pathology: A Use Case in Prostate Cancer Classification

arXiv:1905.06820v1 | 8 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the cost of acquiring labeled data in computational pathology. Its contribution is incremental: it compares existing methods within a specific domain.

The paper tackles label scarcity in computational pathology for prostate cancer classification. It shows that semi-supervised and unsupervised methods outperform supervised learning when few labels are available, and that incorporating immunohistochemistry (IHC) data improves performance over using H&E-stained images alone.

Large amounts of unlabelled data are commonplace for many applications in computational pathology, whereas labelled data is often expensive, both in time and cost, to acquire. We investigate the performance of unsupervised and supervised deep learning methods when few labelled data are available. Three methods are compared: clustering autoencoder latent vectors (unsupervised), a single layer classifier combined with a pre-trained autoencoder (semi-supervised), and a supervised CNN. We apply these methods on hematoxylin and eosin (H&E) stained prostatectomy images to classify tumour versus non-tumour tissue. Results show that semi-/unsupervised methods have an advantage over supervised learning when few labels are available. Additionally, we show that incorporating immunohistochemistry (IHC) stained data provides an increase in performance over only using H&E.
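As a rough illustration of the first method (clustering autoencoder latent vectors), the sketch below clusters a set of latent vectors and then names each cluster using only a handful of labels. It is not the authors' implementation: the abstract does not say which clustering algorithm is used, so k-means with farthest-point initialisation is an assumption, the 2-D "latent vectors" are synthetic stand-ins for real autoencoder outputs, and the names `kmeans`, `name_clusters`, `latents` and `few_labels` are hypothetical.

```python
import random
from collections import Counter


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iters=100):
    """Plain k-means with deterministic farthest-point initialisation (assumed choice)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from every centroid chosen so far.
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    assign = None
    for _ in range(max_iters):
        # Assign each point to its nearest centroid.
        new_assign = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        if new_assign == assign:
            break  # assignments are stable, so the centroids are too
        assign = new_assign
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assign, centroids


def name_clusters(assign, labelled):
    """Map each cluster id to a class by majority vote over the few labelled indices."""
    votes = {}
    for idx, label in labelled.items():
        votes.setdefault(assign[idx], Counter())[label] += 1
    return {c: counts.most_common(1)[0][0] for c, counts in votes.items()}


# Synthetic stand-ins for autoencoder latent vectors: two well-separated blobs.
rng = random.Random(0)
latents = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(20)]
latents += [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(20)]

# Only two of the forty patches carry a label.
few_labels = {0: "non-tumour", 20: "tumour"}

assign, _ = kmeans(latents, k=2)
cluster_to_class = name_clusters(assign, few_labels)
predictions = [cluster_to_class.get(a, "unknown") for a in assign]
```

The labels play no part in forming the clusters; they are only used afterwards to decide which cluster is called tumour, which is why such an approach can remain usable when very few labels exist.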

