IV | AI | CV | LG | AP | Nov 29, 2024

Multimodal Whole Slide Foundation Model for Pathology

arXiv:2411.19666v1 | 172 citations | h-index: 30 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the problem of limited clinical data for rare diseases in computational pathology, offering a scalable solution for resource-limited scenarios, though it builds incrementally on existing foundation model approaches.

The authors tackle the challenge of applying foundation models to complex clinical tasks in computational pathology by developing TITAN, a multimodal whole slide foundation model pretrained on 335,645 whole slide images (WSIs) and aligned with pathology reports and synthetic captions. Without fine-tuning, TITAN outperformed existing models on tasks such as rare disease retrieval and cancer prognosis.

The field of computational pathology has been transformed by recent advances in foundation models that encode histopathology regions of interest (ROIs) into versatile and transferable feature representations via self-supervised learning (SSL). However, translating these advancements to address complex clinical challenges at the patient and slide level remains constrained by limited clinical data in disease-specific cohorts, especially for rare clinical conditions. We propose TITAN, a multimodal whole slide foundation model pretrained using 335,645 WSIs via visual self-supervised learning and vision-language alignment with corresponding pathology reports and 423,122 synthetic captions generated from a multimodal generative AI copilot for pathology. Without any fine-tuning or clinical labels, TITAN can extract general-purpose slide representations and generate pathology reports that generalize to resource-limited clinical scenarios such as rare disease retrieval and cancer prognosis. We evaluate TITAN on diverse clinical tasks and find that TITAN outperforms both ROI and slide foundation models across machine learning settings such as linear probing, few-shot and zero-shot classification, rare cancer retrieval and cross-modal retrieval, and pathology report generation.
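
To make the retrieval and zero-shot settings named in the abstract more concrete, here is a minimal sketch of how embedding-based slide retrieval and zero-shot classification are commonly done with cosine similarity. It is not TITAN's code or API: the function names (`cosine_similarity`, `retrieve_top_k`, `zero_shot_classify`), the slide and class identifiers, and the 3-dimensional toy vectors are all hypothetical, and a real pipeline would obtain the embeddings from a pretrained slide encoder and text encoder.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("zero vector has no direction")
    return dot / (norm_a * norm_b)


def retrieve_top_k(
    query: Sequence[float],
    database: Dict[str, Sequence[float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Return the k database entries most similar to the query embedding."""
    scored = [
        (slide_id, cosine_similarity(query, embedding))
        for slide_id, embedding in database.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


def zero_shot_classify(
    slide_embedding: Sequence[float],
    class_text_embeddings: Dict[str, Sequence[float]],
) -> str:
    """Pick the class whose text embedding is closest to the slide embedding."""
    return max(
        class_text_embeddings,
        key=lambda label: cosine_similarity(
            slide_embedding, class_text_embeddings[label]
        ),
    )


# Hypothetical toy embeddings; real ones would come from pretrained encoders.
slide_db = {
    "slide_a": [0.9, 0.1, 0.0],
    "slide_b": [0.0, 1.0, 0.2],
    "slide_c": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]
top2 = retrieve_top_k(query, slide_db, k=2)

# Hypothetical text-prompt embeddings, one per candidate class.
class_prompts = {
    "class_x": [1.0, 0.0, 0.0],
    "class_y": [0.0, 1.0, 0.0],
}
predicted = zero_shot_classify(query, class_prompts)
```

Neither function trains anything or uses labels: both only compare fixed embeddings, which is why these settings can work without fine-tuning when the encoder's representations are good.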

Code Implementations: 2 repos
