CV | Nov 20, 2024

Unsupervised Foundation Model-Agnostic Slide-Level Representation Learning

Tim Lenz, Peter Neidlinger, Marta Ligero, Georg Wölflein, Marko van Treeck, Jakob Nikolas Kather

arXiv:2411.13623v3 | 14.716 citations | h-index: 15 | Has Code | CVPR

Originality: Incremental advance

AI Analysis

This paper addresses task-agnostic slide representation learning in computational pathology and offers a method that is compatible with unseen feature extractors. It appears incremental, however, because it builds on existing self-supervised and foundation model approaches.

The paper tackles the challenge of generating slide-level representations from pathology whole-slide images without task-specific supervision, proposing COBRA, a self-supervised method that integrates embeddings from multiple foundation models and achieves an average improvement of at least +4.4% AUC over state-of-the-art slide encoders on four public cohorts.

Representation learning of pathology whole-slide images (WSIs) has primarily relied on weak supervision with Multiple Instance Learning (MIL). This approach leads to slide representations that are highly tailored to a specific clinical task. Self-supervised learning (SSL) has been successfully applied to train histopathology foundation models (FMs) for patch embedding generation. However, generating patient- or slide-level embeddings remains challenging. Existing approaches for slide representation learning extend the principles of SSL from patch-level learning to entire slides by aligning different augmentations of the slide or by utilizing multimodal data. By integrating tile embeddings from multiple FMs, we propose a new single-modality SSL method in feature space that generates useful slide representations. Our contrastive pretraining strategy, called COBRA, employs multiple FMs and an architecture based on Mamba-2. COBRA exceeds the performance of state-of-the-art slide encoders on four different public Clinical Proteomic Tumor Analysis Consortium (CPTAC) cohorts on average by at least +4.4% AUC, despite only being pretrained on 3048 WSIs from The Cancer Genome Atlas (TCGA). Additionally, COBRA is readily compatible at inference time with previously unseen feature extractors. Code available at https://github.com/KatherLab/COBRA.
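The idea of contrasting slide embeddings built from different FMs can be illustrated with a generic InfoNCE loss. The following is a minimal sketch under stated assumptions, not COBRA's implementation: the abstract does not specify the loss, so InfoNCE is swapped in as a common contrastive objective; `mean_pool` is a hypothetical stand-in for the Mamba-2-based slide encoder; and the tile embeddings from each FM are assumed to have already been projected to a shared dimension. All function names are illustrative.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def mean_pool(tile_embeddings):
    """Hypothetical stand-in aggregator: average tile vectors into one slide vector.

    Assumption: a learned encoder would replace this simple average.
    """
    n = len(tile_embeddings)
    dim = len(tile_embeddings[0])
    return [sum(tile[d] for tile in tile_embeddings) / n for d in range(dim)]


def info_nce(view_a, view_b, temperature=0.1):
    """Generic InfoNCE loss between two views of the same batch of slides.

    view_a[i] and view_b[i] are slide embeddings of slide i, e.g. derived from
    two different feature extractors. Every other slide in view_b acts as a
    negative for anchor view_a[i].
    """
    a = [l2_normalize(v) for v in view_a]
    b = [l2_normalize(v) for v in view_b]
    total = 0.0
    for i, anchor in enumerate(a):
        # Cosine similarity of the anchor to every slide in the other view.
        logits = [sum(x * y for x, y in zip(anchor, other)) / temperature
                  for other in b]
        # Numerically stable log-sum-exp over all candidates.
        max_logit = max(logits)
        log_denominator = max_logit + math.log(
            sum(math.exp(value - max_logit) for value in logits))
        # Cross-entropy with the matching slide (index i) as the target.
        total += log_denominator - logits[i]
    return total / len(a)


# Toy data: two slides, each with two tiles, as seen by two hypothetical FMs.
slides_fm_a = [mean_pool([[1.0, 0.0], [0.8, 0.2]]),
               mean_pool([[0.0, 1.0], [0.2, 0.8]])]
slides_fm_b = [[1.0, 0.1], [0.1, 1.0]]

aligned_loss = info_nce(slides_fm_a, slides_fm_b)
mismatched_loss = info_nce(slides_fm_a, slides_fm_b[::-1])
```

In this toy setup the loss is lower when the two views of each slide are paired correctly than when they are swapped, which is the signal a contrastive pretraining objective would minimize.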

