LG · Sep 4, 2025

Synthetic Survival Data Generation for Heart Failure Prognosis Using Deep Generative Models

arXiv:2509.04245v21 citations · h-index: 4
Originality: Synthesis-oriented
AI Analysis

This work addresses data-sharing barriers for heart failure researchers, though it is incremental: it applies existing generative methods to a specific domain.

The researchers tackled limited access to heart failure datasets, a consequence of privacy concerns, by generating synthetic data with deep generative models. SurvivalGAN and TabDDPM produced the highest-fidelity data, while SurvivalGAN (C-indices of 0.71-0.76) and TVAE (C-indices of 0.73-0.76) performed best in survival prediction, closely matching real data performance (C-indices of 0.73-0.76).

Background: Heart failure (HF) research is constrained by limited access to large, shareable datasets due to privacy regulations and institutional barriers. Synthetic data generation offers a promising way to overcome these challenges while preserving patient confidentiality. Methods: We generated synthetic HF datasets from institutional data comprising 12,552 unique patients using five deep learning models: tabular variational autoencoder (TVAE), normalizing flow, ADSGAN, SurvivalGAN, and tabular denoising diffusion probabilistic models (TabDDPM). We comprehensively evaluated synthetic data utility through statistical similarity metrics, survival prediction using machine learning, and privacy assessments. Results: SurvivalGAN and TabDDPM demonstrated high fidelity to the original dataset, exhibiting similar variable distributions and survival curves after applying histogram equalization. SurvivalGAN (C-indices: 0.71-0.76) and TVAE (C-indices: 0.73-0.76) achieved the strongest performance in survival prediction evaluation, closely matching real data performance (C-indices: 0.73-0.76). Privacy evaluation confirmed protection against re-identification attacks. Conclusions: Deep learning-based synthetic data generation can produce high-fidelity, privacy-preserving HF datasets suitable for research applications. This publicly available synthetic dataset addresses critical data sharing barriers and provides a valuable resource for advancing HF research and predictive modeling.
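The C-indices reported above measure how well a model ranks patients by risk. The abstract does not say which C-index variant or software was used, so the following is only a minimal sketch of Harrell's concordance index, written in plain Python under that assumption. The function name `concordance_index` and the toy data are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable pairs ranked correctly.

    times:       observed follow-up time per patient
    events:      1 if the event (e.g. death) was observed, 0 if censored
    risk_scores: model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the patient with the shorter time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time ended in an
        # observed event. Tied times are skipped in this simplified sketch.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher risk failed earlier: correct order
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Hypothetical toy cohort of five patients (illustrative values only).
toy_times = [5, 8, 12, 3, 10]
toy_events = [1, 0, 1, 1, 0]
toy_risk = [0.9, 0.4, 0.2, 0.8, 0.5]
toy_c_index = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. One common way to use such a metric for synthetic data, which may or may not match the authors' protocol, is to train a survival model on synthetic records and compute the C-index on held-out real patients.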
