Dynamic Survival Prediction using Longitudinal Images based on Transformer
This work addresses early disease detection and prognosis for medical practitioners, improving on existing methods that underutilize censored data and lack interpretability. The contribution appears incremental, however, since it builds on the established Transformer and Cox model frameworks.
The paper tackles survival prediction from longitudinal medical images by introducing SurLonFormer, a Transformer-based neural network that integrates imaging and structured data, and reports superior predictive performance in simulations and in an Alzheimer's disease application.
Survival analysis utilizing multiple longitudinal medical images plays a pivotal role in the early detection and prognosis of diseases by providing insight beyond single-image evaluations. However, current methodologies often inadequately utilize censored data, overlook correlations among longitudinal images measured over multiple time points, and lack interpretability. We introduce SurLonFormer, a novel Transformer-based neural network that integrates longitudinal medical imaging with structured data for survival prediction. Our architecture comprises three key components: a Vision Encoder for extracting spatial features, a Sequence Encoder for aggregating temporal information, and a Survival Encoder based on the Cox proportional hazards model. This framework effectively incorporates censored data, addresses scalability issues, and enhances interpretability through occlusion sensitivity analysis and dynamic survival prediction. Extensive simulations and a real-world application in Alzheimer's disease analysis demonstrate that SurLonFormer achieves superior predictive performance and successfully identifies disease-related imaging biomarkers.
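The Survival Encoder is described as based on the Cox proportional hazards model, which is what allows censored subjects to contribute to training. As a minimal sketch of that idea, and not the authors' implementation (the abstract does not specify the exact loss), the negative Cox partial log-likelihood can be written as below. The function name `cox_neg_partial_log_likelihood`, the averaging over observed events, and the lack of any correction for tied event times are assumptions made for illustration.

```python
import math


def cox_neg_partial_log_likelihood(risk_scores, times, events):
    """Negative Cox partial log-likelihood, averaged over observed events.

    risk_scores: predicted log relative hazards, one per subject
                 (hypothetically, the scalar output of a survival head)
    times:       observed follow-up times
    events:      1 if the event was observed, 0 if the subject was censored

    Assumption: no special handling of tied event times.
    """
    total = 0.0
    n_events = 0
    for score_i, t_i, e_i in zip(risk_scores, times, events):
        if not e_i:
            # Censored subjects add no term of their own; they still
            # appear in the risk sets of events that occur before them.
            continue
        # Risk set: every subject still under observation at time t_i.
        risk_set = [s for s, t in zip(risk_scores, times) if t >= t_i]
        # Numerically stable log-sum-exp over the risk set.
        m = max(risk_set)
        log_sum = m + math.log(sum(math.exp(s - m) for s in risk_set))
        total += score_i - log_sum
        n_events += 1
    # Guard against a batch that contains no observed events.
    return -total / max(n_events, 1)


# Three subjects; the second is censored at time 2.
loss = cox_neg_partial_log_likelihood(
    risk_scores=[0.0, 0.0, 0.0],
    times=[1.0, 2.0, 3.0],
    events=[1, 0, 1],
)
```

In this example the censored subject contributes no numerator term, yet it is counted in the risk set of the event at time 1. This is how the partial likelihood uses censored observations rather than discarding them.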