CV | Nov 16, 2023
Overcoming Data Scarcity in Biomedical Imaging with a Foundational Multi-Task Model
Raphael Schäfer, Till Nicke, Henning Höfener et al.
Foundational models, pretrained on a large scale, have demonstrated substantial success across non-medical domains. However, training these models typically requires large, comprehensive datasets, which contrasts with the smaller and more heterogeneous datasets common in biomedical imaging. Here, we propose a multi-task learning strategy that decouples the number of training tasks from memory requirements. We trained a Universal bioMedical PreTrained model (UMedPT) on a multi-task database including tomographic, microscopic, and X-ray images, with various labelling strategies such as classification, segmentation, and object detection. The UMedPT foundational model outperformed ImageNet pretraining and the previous state-of-the-art models. For tasks related to the pretraining database, it maintained its performance with only 1% of the original training data and without fine-tuning. For out-of-domain tasks, it required no more than 50% of the original training data. In an external independent validation, imaging features extracted using UMedPT proved to be a new standard for cross-center transferability.
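One way to picture how the number of tasks can be decoupled from memory is gradient accumulation: tasks are processed one at a time, their gradients with respect to the shared parameters are summed, and a single update is applied afterwards. The abstract does not spell out the mechanism, so the toy below is only a sketch under that assumption. The `Task` class, the scalar "encoder" weight, and the fixed per-task heads are hypothetical simplifications.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    name: str
    head: float        # task-specific parameter (kept fixed here for brevity)
    xs: List[float]
    ys: List[float]


def task_grad(shared_w: float, task: Task) -> float:
    """Gradient of the task's mean squared error w.r.t. the shared weight."""
    n = len(task.xs)
    return sum(
        2 * (task.head * shared_w * x - y) * task.head * x
        for x, y in zip(task.xs, task.ys)
    ) / n


def train_step(shared_w: float, tasks: List[Task], lr: float = 0.05) -> float:
    """Accumulate gradients task by task, then apply one shared update."""
    accumulated = 0.0
    for task in tasks:  # only one task's batch needs to be held at a time
        accumulated += task_grad(shared_w, task)
    return shared_w - lr * accumulated / len(tasks)


# Two toy tasks that are both consistent with a shared weight of 2.0.
tasks = [
    Task("classification", head=1.0, xs=[1.0, 2.0], ys=[2.0, 4.0]),
    Task("segmentation", head=2.0, xs=[1.0, 3.0], ys=[4.0, 12.0]),
]
w = 0.0
for _ in range(200):
    w = train_step(w, tasks)
```

Because only the running gradient sum persists between tasks, peak memory in this sketch depends on the largest single task rather than on how many tasks there are.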
IV | Sep 5, 2024
Tissue Concepts: supervised foundation models in computational pathology
Till Nicke, Jan Raphael Schaefer, Henning Hoefener et al.
Due to the increasing workload of pathologists, the need for automation to support diagnostic tasks and quantitative biomarker evaluation is becoming increasingly apparent. Foundation models have the potential to improve generalizability within and across centers and serve as starting points for data-efficient development of specialized yet robust AI models. However, training foundation models themselves is usually very expensive in terms of data, computation, and time. This paper proposes a supervised training method that drastically reduces these expenses. The proposed method is based on multi-task learning to train a joint encoder, by combining 16 different classification, segmentation, and detection tasks on a total of 912,000 patches. Since the encoder is capable of capturing the properties of the samples, we term it the Tissue Concepts encoder. To evaluate the performance and generalizability of the Tissue Concepts encoder across centers, classification of whole slide images from four of the most prevalent solid cancers - breast, colon, lung, and prostate - was used. The experiments show that the Tissue Concepts model achieves comparable performance to models trained with self-supervision, while requiring only 6% of the amount of training patches. Furthermore, the Tissue Concepts encoder outperforms an ImageNet pre-trained encoder on both in-domain and out-of-domain data.
IV | Jul 8, 2025
Tissue Concepts v2: A Supervised Foundation Model For Whole Slide Images
Till Nicke, Daniela Schacherer, Jan Raphael Schäfer et al.
Foundation models (FMs) are transforming the field of computational pathology by offering new approaches to analyzing histopathology images. Because it typically relies on weeks of training on large databases, the creation of FMs is a resource-intensive process. To address this, we introduce Tissue Concepts v2 (TCv2), an extension of our supervised foundation model, Tissue Concepts, to whole slide images. TCv2 uses supervised, end-to-end multitask learning on slide-level labels. Training TCv2 requires only a fraction of the resources needed for self-supervised training. The presented model shows superior performance compared to models trained with self-supervised learning (SSL) in cancer subtyping benchmarks and is fully trained on freely available data. Furthermore, a shared trained attention module provides an additional layer of explainability across different tasks.
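The abstract does not describe how the shared attention module is built. As a hedged illustration of why such a module can aid explainability, the sketch below uses a generic attention pooling step over patch embeddings: the softmax weights that aggregate patches into a slide-level vector can also be read as per-patch importance scores. The function name, the single attention vector, and the toy embeddings are assumptions, not the TCv2 design.

```python
import math


def attention_pool(patch_embeddings, attn_vector):
    """Aggregate patch embeddings into one slide embedding via softmax attention.

    Returns the pooled embedding and the per-patch attention weights.
    """
    # Unnormalized score per patch: dot product with the attention vector.
    scores = [
        sum(a * e for a, e in zip(attn_vector, emb)) for emb in patch_embeddings
    ]
    # Numerically stable softmax over patches.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [v / total for v in exps]
    # Weighted sum of patch embeddings, dimension by dimension.
    dim = len(patch_embeddings[0])
    pooled = [
        sum(w * emb[d] for w, emb in zip(weights, patch_embeddings))
        for d in range(dim)
    ]
    return pooled, weights


patches = [[1.0, 0.0], [0.0, 1.0], [3.0, 0.0]]
slide_embedding, patch_weights = attention_pool(patches, attn_vector=[1.0, 0.0])
```

In this toy input the third patch scores highest, so it dominates the slide embedding. Plotting such weights over the slide is one common way to inspect which regions drove a prediction.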
IV | May 29, 2023
The ACROBAT 2022 Challenge: Automatic Registration Of Breast Cancer Tissue
Philippe Weitz, Masi Valkonen, Leslie Solorzano et al.
The alignment of tissue between histopathological whole-slide images (WSIs) is crucial for research and clinical applications. Advances in computing, deep learning, and availability of large WSI datasets have revolutionised WSI analysis. Given this rapid progress, the current state of the art in WSI registration is unclear. To address this, we conducted the ACROBAT challenge, based on the largest WSI registration dataset to date, including 4,212 WSIs from 1,152 breast cancer patients. The challenge objective was to align WSIs of tissue that was stained with routine diagnostic immunohistochemistry to its H&E-stained counterpart. We compare the performance of eight WSI registration algorithms, including an investigation of the impact of different WSI properties and clinical covariates. We find that conceptually distinct WSI registration methods can lead to highly accurate registration performances and identify covariates that impact performances across methods. These results establish the current state of the art in WSI registration and guide researchers in selecting and developing methods.
QM | Feb 24, 2022
Deep Learning based Prediction of MSI using MMR Markers in Colorectal Cancer
Ruqayya Awan, Mohammed Nimir, Shan E Ahmed Raza et al.
The accurate diagnosis and molecular profiling of colorectal cancers (CRC) are critical for planning the best treatment options for patients. Microsatellite instability (MSI) or mismatch repair (MMR) status plays a vital role in appropriate treatment selection, has prognostic implications and is used to investigate the possibility of patients having underlying genetic disorders (Lynch syndrome). NICE recommends that all CRC patients should be offered MMR/MSI testing. Immunohistochemistry is commonly used to assess MMR status with subsequent molecular testing performed as required. This incurs significant extra costs and requires additional resources. The introduction of automated methods that can predict MSI or MMR status from a target image could substantially reduce the cost associated with MMR testing. Unlike previous studies on MSI prediction involving training a CNN using coarse labels (MSI vs Microsatellite Stable (MSS)), we have utilised fine-grained MMR labels for training purposes. In this paper, we present our work on predicting MSI status in a two-stage process using a single target slide either stained with CK8/18 or H&E. First, we trained a multi-headed convolutional neural network model where each head was responsible for predicting one of the MMR protein expressions. To this end, we performed the registration of MMR stained slides to the target slide as a pre-processing step. In the second stage, statistical features computed from the MMR prediction maps were used for the final MSI prediction. Our results demonstrated that MSI classification can be improved by incorporating fine-grained MMR labels in comparison to the previous approaches in which only coarse labels were utilised.
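The second stage described above turns per-marker prediction maps into a compact feature vector. The abstract does not list which statistics were used or which classifier consumed them, so the sketch below is only an illustration: the three statistics, the 0.5 threshold, and the marker names used as dictionary keys are assumptions.

```python
from statistics import mean, pstdev


def map_features(prob_map, threshold=0.5):
    """Summarize one 2D prediction map (rows of probabilities) with simple statistics."""
    values = [p for row in prob_map for p in row]
    return {
        "mean": mean(values),
        "std": pstdev(values),
        # Fraction of locations predicted as expressing the protein.
        "frac_positive": sum(p > threshold for p in values) / len(values),
    }


def slide_feature_vector(maps_by_marker):
    """Concatenate per-marker statistics in a fixed (alphabetical) marker order."""
    features = []
    for marker in sorted(maps_by_marker):
        stats = map_features(maps_by_marker[marker])
        features.extend(stats[key] for key in ("mean", "std", "frac_positive"))
    return features


# Toy 2x2 maps for two markers; real maps would come from the network heads.
maps = {
    "MLH1": [[0.9, 0.8], [0.7, 0.6]],
    "MSH2": [[0.1, 0.2], [0.3, 0.4]],
}
vec = slide_feature_vector(maps)
```

The resulting vector would then be passed to whatever classifier produces the final MSI/MSS call. That classifier is deliberately left out here because the abstract does not name it.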
IV | Feb 21, 2022
Deep Feature based Cross-slide Registration
Ruqayya Awan, Shan E Ahmed Raza, Johannes Lotz et al.
Cross-slide image analysis provides additional information by analysing the expression of different biomarkers as compared to a single slide analysis. These biomarker stained slides are analysed side by side, revealing unknown relations between them. During the slide preparation, a tissue section may be placed at an arbitrary orientation as compared to other sections of the same tissue block. The problem is compounded by the fact that tissue contents are likely to change from one section to the next and there may be unique artefacts on some of the slides. This makes registration of each section to a reference section of the same tissue block an important pre-requisite task before any cross-slide analysis. We propose a deep feature based registration (DFBR) method which utilises data-driven features to estimate the rigid transformation. We adopted a multi-stage strategy for improving the quality of registration. We also developed a visualisation tool to view registered pairs of WSIs at different magnifications. With the help of this tool, one can apply a transformation on the fly without the need to generate a transformed source WSI in pyramidal form. We compared the performance of data-driven features with that of hand-crafted features on the COMET dataset. Our approach can align the images with low registration errors. Generally, the success of non-rigid registration is dependent on the quality of rigid registration. To evaluate the efficacy of the DFBR method, the first two steps of the ANHIR winner's framework are replaced with our DFBR to register the challenge-provided image pairs. The modified framework produces results comparable to those of the challenge-winning team.
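Once feature points have been matched between two sections, a rigid transformation (rotation plus translation) can be fitted in closed form. The sketch below assumes the matches are already given and uses a standard 2D least-squares fit. It is a stand-in for illustration and not necessarily the estimator used in DFBR; the function names are hypothetical.

```python
import math


def apply_rigid(point, theta, translation):
    """Rotate a 2D point by theta (radians) about the origin, then translate."""
    c, s = math.cos(theta), math.sin(theta)
    x, y = point
    return (c * x - s * y + translation[0], s * x + c * y + translation[1])


def estimate_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping src points onto dst."""
    n = len(src)
    src_cx = sum(p[0] for p in src) / n
    src_cy = sum(p[1] for p in src) / n
    dst_cx = sum(q[0] for q in dst) / n
    dst_cy = sum(q[1] for q in dst) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - src_cx, py - src_cy  # centered source point
        bx, by = qx - dst_cx, qy - dst_cy  # centered destination point
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    # Translation aligns the rotated source centroid with the destination centroid.
    rcx, rcy = apply_rigid((src_cx, src_cy), theta, (0.0, 0.0))
    return theta, (dst_cx - rcx, dst_cy - rcy)


# Synthetic check: rotate by 30 degrees, shift by (5, -2), then recover both.
src = [(0.0, 0.0), (10.0, 0.0), (0.0, 5.0), (7.0, 3.0)]
true_theta = math.radians(30)
dst = [apply_rigid(p, true_theta, (5.0, -2.0)) for p in src]
theta, translation = estimate_rigid_2d(src, dst)
```

With real feature matches, outliers would usually be handled before or during the fit (for example with a robust estimator), which this sketch omits.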
IV | Jun 24, 2021
Comparison of Consecutive and Re-stained Sections for Image Registration in Histopathology
Johannes Lotz, Nick Weiss, Jeroen van der Laak et al.
Purpose: In digital histopathology, virtual multi-staining is important for diagnosis and biomarker research. Additionally, it provides accurate ground-truth for various deep-learning tasks. Virtual multi-staining can be obtained using different stains for consecutive sections or by re-staining the same section. Both approaches require image registration to compensate for tissue deformations, but little attention has been devoted to comparing their accuracy. Approach: We compare variational image registration of consecutive and re-stained sections and analyze the effect of the image resolution, which influences accuracy and required computational resources. We present a new hybrid dataset of re-stained and consecutive sections (HyReCo, 81 slide pairs, approx. 3000 landmarks) that we made publicly available and compare its image registration results to the automatic non-rigid histological image registration (ANHIR) challenge data (230 consecutive slide pairs). Results: We obtain a median landmark error after registration of 7.1 μm (HyReCo) and 16.0 μm (ANHIR) between consecutive sections. Between re-stained sections, the median registration error is 2.3 μm and 0.9 μm in the two subsets of the HyReCo dataset. We observe that deformable registration leads to lower landmark errors than affine registration in both cases, though the effect is smaller in re-stained sections. Conclusion: Deformable registration of consecutive and re-stained sections is a valuable tool for the joint analysis of different stains. Significance: While the registration of re-stained sections allows nucleus-level alignment and thus a direct analysis of interacting biomarkers, consecutive sections only allow the transfer of region-level annotations. The latter can be achieved at low computational cost using coarser image resolutions.
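A median landmark error in micrometres, as reported above, can be computed from corresponding landmark pairs once the pixel spacing is known. The sketch below assumes landmarks are given in pixel coordinates at a single resolution; the function name and the toy spacing of 0.5 μm per pixel are illustrative assumptions.

```python
import math
from statistics import median


def median_landmark_error_um(warped, target, um_per_pixel):
    """Median Euclidean distance between corresponding landmarks, in micrometres."""
    distances_px = [
        math.hypot(wx - tx, wy - ty) for (wx, wy), (tx, ty) in zip(warped, target)
    ]
    return median(d * um_per_pixel for d in distances_px)


# Three landmark pairs with pixel errors of 0, 5 and 1.
warped = [(0.0, 0.0), (3.0, 4.0), (10.0, 10.0)]
target = [(0.0, 0.0), (0.0, 0.0), (10.0, 11.0)]
error_um = median_landmark_error_um(warped, target, um_per_pixel=0.5)
```

The median is less sensitive than the mean to a few badly placed landmarks, which is one reason it is a common summary for registration accuracy.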
IV | Mar 17, 2020
Virtual staining for mitosis detection in Breast Histopathology
Caner Mercan, Germonda Reijnen-Mooij, David Tellez Martin et al.
We propose a virtual staining methodology based on Generative Adversarial Networks to map histopathology images of breast cancer tissue from H&E stain to PHH3 and vice versa. We use the resulting synthetic images to build Convolutional Neural Networks (CNNs) for automatic detection of mitotic figures, a strong prognostic biomarker used in routine breast cancer diagnosis and grading. We propose several scenarios in which CNNs trained with synthetically generated histopathology images perform on par with or even better than the same baseline model trained with real images. We discuss the potential of this application to scale the number of training samples without the need for manual annotations.
CV | Mar 28, 2019
Robust, fast and accurate: a 3-step method for automatic histological image registration
Johannes Lotz, Nick Weiss, Stefan Heldmann
We present a 3-step registration pipeline for differently stained histological serial sections that consists of 1) a robust pre-alignment, 2) a parametric registration computed on coarse resolution images, and 3) an accurate nonlinear registration. In all three steps the NGF distance measure is minimized with respect to an increasingly flexible transformation. We apply the method in the ANHIR image registration challenge and evaluate its performance on the training data. The presented method is robust (error reduction in 99.6% of the cases), fast (runtime 4 seconds) and accurate (median relative target registration error 0.19%).
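The normalized gradient fields (NGF) distance compares the orientation of image gradients rather than raw intensities, which suits differently stained sections. The abstract gives no formula, so the sketch below uses one common form of the measure with an edge parameter `eps`; the exact regularization and discretization in the paper may differ. Forward differences and the function names are assumptions.

```python
import math


def gradients(img):
    """Forward-difference gradients (gx, gy) on the interior grid of a 2D list."""
    h, w = len(img), len(img[0])
    return [
        [(img[y][x + 1] - img[y][x], img[y + 1][x] - img[y][x]) for x in range(w - 1)]
        for y in range(h - 1)
    ]


def ngf_distance(reference, template, eps=0.01):
    """0.5 * sum over pixels of 1 - (n_T . n_R)^2, with n = grad / sqrt(|grad|^2 + eps^2)."""
    total = 0.0
    for row_r, row_t in zip(gradients(reference), gradients(template)):
        for (rx, ry), (tx, ty) in zip(row_r, row_t):
            norm_r = math.sqrt(rx * rx + ry * ry + eps * eps)
            norm_t = math.sqrt(tx * tx + ty * ty + eps * eps)
            cosine = (rx * tx + ry * ty) / (norm_r * norm_t)
            total += 1.0 - cosine * cosine
    return 0.5 * total


vertical_edge = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
inverted = [[1.0 - v for v in row] for row in vertical_edge]
horizontal_edge = [[0.0] * 4, [0.0] * 4, [1.0] * 4, [1.0] * 4]

d_same = ngf_distance(vertical_edge, vertical_edge)
d_inverted = ngf_distance(vertical_edge, inverted)
d_other = ngf_distance(vertical_edge, horizontal_edge)
```

Because the cosine is squared, inverting the intensities leaves the distance unchanged, while an edge of different orientation increases it. In this form, flat regions contribute a constant, so only differences between candidate alignments are meaningful.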
CV | Aug 17, 2018
Epithelium segmentation using deep learning in H&E-stained prostate specimens with immunohistochemistry as reference standard
Wouter Bulten, Péter Bándi, Jeffrey Hoven et al.
Prostate cancer (PCa) is graded by pathologists by examining the architectural pattern of cancerous epithelial tissue on hematoxylin and eosin (H&E) stained slides. Given the importance of gland morphology, automatically differentiating between glandular epithelial tissue and other tissues is an important prerequisite for the development of automated methods for detecting PCa. We propose a new method, using deep learning, for automatically segmenting epithelial tissue in digitized prostatectomy slides. We employed immunohistochemistry (IHC) to render the ground truth less subjective and more precise compared to manual outlining on H&E slides, especially in areas with high-grade and poorly differentiated PCa. Our dataset consisted of 102 tissue blocks, including both low and high grade PCa. From each block a single new section was cut, stained with H&E, scanned, restained using P63 and CK8/18 to highlight the epithelial structure, and scanned again. The H&E slides were co-registered to the IHC slides. On a subset of the IHC slides we applied color deconvolution, corrected stain errors manually, and trained a U-Net to perform segmentation of epithelial structures. Whole-slide segmentation masks generated by the IHC U-Net were used to train a second U-Net on H&E. Our system makes precise cell-level segmentations and segments both intact glands as well as individual (tumor) epithelial cells. We achieved an F1-score of 0.895 on a hold-out test set and 0.827 on an external reference set from a different center. We envision this segmentation as being the first part of a fully automated prostate cancer detection and grading pipeline.
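For a binary segmentation, the F1-score combines precision and recall as 2·TP / (2·TP + FP + FN). The abstract does not state how pixels were aggregated across slides, so the sketch below assumes a plain pixel-wise count over one pair of masks; the function name and toy masks are illustrative.

```python
def f1_score(pred_mask, true_mask):
    """Pixel-wise F1 between two binary masks given as 2D lists of 0/1."""
    tp = fp = fn = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    denominator = 2 * tp + fp + fn
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if denominator == 0 else 2 * tp / denominator


pred = [[1, 1], [0, 0]]
truth = [[1, 0], [1, 0]]
score = f1_score(pred, truth)  # one TP, one FP, one FN
```

For binary masks this pixel-wise F1 is identical to the Dice coefficient, so the two names are often used interchangeably in segmentation work.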