CV · Apr 14, 2023

Masked Pre-Training of Transformers for Histology Image Analysis

arXiv:2304.07434v1 · 25 citations · h-index: 47
Originality: Incremental advance
AI Analysis

This work addresses limited labeled data in digital pathology for cancer diagnosis and prognosis, representing an incremental advance in transformer-based methods for histology image analysis.

The authors tackle the challenge of applying transformer models to whole slide histology images by proposing MaskHIT, a masked pre-training method that learns features without labeled data. MaskHIT improves on multiple instance learning approaches by 3% on survival prediction and 2% on cancer subtype classification.

In digital pathology, whole slide images (WSIs) are widely used for applications such as cancer diagnosis and prognosis prediction. Visual transformer models have recently emerged as a promising method for encoding large regions of WSIs while preserving spatial relationships among patches. However, due to the large number of model parameters and limited labeled data, applying transformer models to WSIs remains challenging. Inspired by masked language models, we propose a pretext task for training the transformer model without labeled data to address this problem. Our model, MaskHIT, uses the transformer output to reconstruct masked patches and learn representative histological features based on their positions and visual features. The experimental results demonstrate that MaskHIT surpasses various multiple instance learning approaches by 3% and 2% on survival prediction and cancer subtype classification tasks, respectively. Furthermore, MaskHIT outperforms two of the most recent state-of-the-art transformer-based methods. Finally, a comparison of the attention maps generated by the MaskHIT model with pathologists' annotations indicates that the model can accurately identify clinically relevant histological structures in each task.
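The masking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names `mask_patches` and `masked_mse`, the 25% mask ratio, the all-zero mask token, and the mean-squared-error objective are all assumptions chosen for illustration, and an identity function stands in for the transformer. The sketch only shows how a subset of patch features is hidden and how a reconstruction loss can be restricted to the hidden positions.

```python
import random


def mask_patches(features, mask_ratio, mask_token, rng):
    """Replace a random subset of patch feature vectors with a mask token.

    Returns the corrupted sequence and the sorted indices that were masked.
    (Hypothetical helper; the ratio and token are illustrative assumptions.)
    """
    n = len(features)
    n_mask = max(1, round(n * mask_ratio))
    masked_idx = sorted(rng.sample(range(n), n_mask))
    masked_set = set(masked_idx)
    corrupted = [
        list(mask_token) if i in masked_set else list(f)
        for i, f in enumerate(features)
    ]
    return corrupted, masked_idx


def masked_mse(predicted, target, masked_idx):
    """Mean squared error computed over the masked positions only."""
    total, count = 0.0, 0
    for i in masked_idx:
        for p, t in zip(predicted[i], target[i]):
            total += (p - t) ** 2
            count += 1
    return total / count


# Toy data: 8 patches on a 2x4 grid, each with a 3-dimensional feature vector.
rng = random.Random(0)
features = [[rng.random() for _ in range(3)] for _ in range(8)]
# Grid coordinates kept alongside the features; a real model would embed
# these so that masked patches can be predicted from their positions.
positions = [(row, col) for row in range(2) for col in range(4)]
mask_token = [0.0, 0.0, 0.0]

corrupted, masked_idx = mask_patches(features, 0.25, mask_token, rng)

# Stand-in for the transformer: an identity "model" that returns its input.
predicted = corrupted
loss = masked_mse(predicted, features, masked_idx)
print(f"masked patches: {masked_idx}, reconstruction loss: {loss:.4f}")
```

In an actual pre-training loop, `predicted` would come from a transformer that sees the corrupted sequence together with position information, and the loss would be minimized by gradient descent. Restricting the loss to masked positions means the model is rewarded only for inferring hidden content from its context, not for copying visible inputs.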

Code Implementations: 1 repo