IVAICVAug 8, 2025

impuTMAE: Multi-modal Transformer with Masked Pre-training for Missing Modalities Imputation in Cancer Survival Prediction

arXiv:2508.09195v1h-index: 2Has Code
Originality Incremental advance
AI Analysis

This addresses the challenge of incomplete medical data for improving prognostic models in cancer research, representing a strong domain-specific advancement.

The paper tackles the problem of missing modalities in multimodal cancer survival prediction by introducing impuTMAE, a transformer-based method that imputes missing data during pre-training, achieving state-of-the-art performance in glioma survival prediction on TCGA-GBM/LGG and BraTS datasets.

The use of diverse modalities, such as omics, medical images, and clinical data can not only improve the performance of prognostic models but also deepen an understanding of disease mechanisms and facilitate the development of novel treatment approaches. However, medical data are complex, often incomplete, and contains missing modalities, making effective handling its crucial for training multimodal models. We introduce impuTMAE, a novel transformer-based end-to-end approach with an efficient multimodal pre-training strategy. It learns inter- and intra-modal interactions while simultaneously imputing missing modalities by reconstructing masked patches. Our model is pre-trained on heterogeneous, incomplete data and fine-tuned for glioma survival prediction using TCGA-GBM/LGG and BraTS datasets, integrating five modalities: genetic (DNAm, RNA-seq), imaging (MRI, WSI), and clinical data. By addressing missing data during pre-training and enabling efficient resource utilization, impuTMAE surpasses prior multimodal approaches, achieving state-of-the-art performance in glioma patient survival prediction. Our code is available at https://github.com/maryjis/mtcp

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes