IV AI CV | Jan 13, 2025

A Multi-Modal Deep Learning Framework for Pan-Cancer Prognosis

Binyu Zhang, Shichao Li, Junpeng Jian, Zhu Meng, Limei Guo, Zhicheng Zhao

arXiv:2501.07016v1 | 1 citation | h-index: 7 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for more accurate and generalizable prognostic models in oncology, which can improve patient survival analysis and treatment planning, though it is incremental in combining existing modalities with novel fusion mechanisms.

The authors tackled the problem of pan-cancer prognosis by developing UMPSNet, a multi-modal deep learning framework that integrates histopathology images, genomic profiles, and textual meta-data, achieving state-of-the-art performance and demonstrating strong generalization across multiple cancer types.

Prognosis is an important task because it is closely related to patient survival analysis, the optimization of treatment plans, and the allocation of resources. Existing prognostic models have shown promising results on specific datasets, but they are limited in two respects. First, they explore only certain types of modal data, such as patient histopathology whole-slide images (WSIs) and gene expression profiles. Second, they adopt a per-cancer-per-model paradigm: each trained model can predict the prognosis of only a single cancer type, which results in weak generalization. This paper proposes a deep-learning-based model named UMPSNet. To build a comprehensive picture of a patient's condition, UMPSNet constructs separate encoders for histopathology images and genomic expression profiles, and additionally integrates four types of important meta data (demographic information, cancer type information, treatment protocols, and diagnosis results) into text templates, from which a text encoder extracts textual features. An optimal transport (OT)-based attention mechanism is then used to align and fuse the features of the different modalities. Furthermore, a guided soft mixture of experts (GMoE) mechanism is introduced to address the distribution differences among multiple cancer datasets. By incorporating multi-modal patient data and joint training, UMPSNet outperforms all SOTA approaches and demonstrates the effectiveness and generalization ability of the proposed learning paradigm of a single model for multiple cancer types. The code of UMPSNet is available at https://github.com/binging512/UMPSNet.
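To make the idea of OT-based alignment more concrete, the following is a minimal, generic sketch of entropic optimal transport (Sinkhorn iterations) used as an attention-like weighting between two sets of modality features. It is not taken from the UMPSNet code: the function names, the cosine cost, the uniform marginals, the `eps` value, and the toy feature vectors are all assumptions chosen for illustration, and the actual mechanism in the paper may differ.

```python
import math


def cosine_cost(x, y):
    """Cost between two feature vectors: 1 - cosine similarity (assumed cost)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / (norm_x * norm_y)  # assumes non-zero vectors


def sinkhorn_plan(cost, eps=0.5, n_iters=500):
    """Entropic-regularised OT plan between uniform marginals."""
    n, m = len(cost), len(cost[0])
    a = [1.0 / n] * n  # uniform mass on source tokens (assumption)
    b = [1.0 / m] * m  # uniform mass on target tokens (assumption)
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def ot_attention(queries, keys, eps=0.5):
    """Use the OT plan as attention weights: each query becomes a
    plan-weighted average of the key features."""
    cost = [[cosine_cost(q, k) for k in keys] for q in queries]
    plan = sinkhorn_plan(cost, eps=eps)
    dim = len(keys[0])
    fused = []
    for row in plan:
        total = sum(row)
        weights = [p / total for p in row]  # row-normalise the plan
        fused.append([sum(w * k[d] for w, k in zip(weights, keys)) for d in range(dim)])
    return plan, fused


# Toy stand-ins for encoder outputs (hypothetical values, 3-dimensional).
image_tokens = [[1.0, 0.0, 0.2], [0.1, 1.0, 0.0], [0.3, 0.3, 1.0]]
gene_tokens = [[0.9, 0.1, 0.0], [0.0, 0.8, 0.3], [0.2, 0.1, 0.9], [0.5, 0.5, 0.5]]

plan, fused = ot_attention(image_tokens, gene_tokens)
```

Here `plan[i][j]` is the mass transported from image token `i` to genomic token `j`, and `fused[i]` is the genomic feature aggregated for image token `i`. Unlike softmax attention, which normalizes each row independently, the transport plan is constrained on both rows and columns, so every token in both modalities receives a share of the total mass.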
