Diffusion-based Conditional ECG Generation with Structured State Space Models
This work addresses privacy concerns around the distribution of sensitive health data for medical applications. Its contribution is incremental, however, because it combines two existing technologies (diffusion models and structured state space models) and applies them to a specific domain.
The authors tackle the problem of generating synthetic 12-lead electrocardiograms (ECGs) conditioned on more than 70 ECG statements, motivated by privacy concerns in health data distribution. Their model, SSSD-ECG, combines diffusion models with structured state space models. SSSD-ECG clearly outperformed its GAN-based competitors in two evaluations: pretrained classifiers evaluated on the generated data, and a classifier trained only on synthetic data. A clinical Turing test further indicated high sample quality across a wide range of conditions.
Synthetic data generation is a promising solution to address privacy issues with the distribution of sensitive health data. Recently, diffusion models have set new standards for generative models for different data modalities. Also very recently, structured state space models emerged as a powerful modeling paradigm to capture long-term dependencies in time series. We put forward SSSD-ECG, the combination of these two technologies, for the generation of synthetic 12-lead electrocardiograms conditioned on more than 70 ECG statements. Due to a lack of reliable baselines, we also propose conditional variants of two state-of-the-art unconditional generative models. We thoroughly assess the quality of the generated samples by evaluating pretrained classifiers on the generated data and by evaluating the performance of a classifier trained only on synthetic data, where SSSD-ECG clearly outperforms its GAN-based competitors. We demonstrate the soundness of our approach through further experiments, including conditional class interpolation and a clinical Turing test demonstrating the high quality of the SSSD-ECG samples across a wide range of conditions.
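As a rough illustration of the second evaluation described above (a classifier trained only on synthetic data, then scored on real data), the following Python sketch compares a "train on real" baseline against a "train on synthetic" run. Everything in it is an assumption made for illustration: the Gaussian toy features stand in for ECG recordings, the "synthetic" set is simply drawn from a slightly shifted distribution rather than from a generative model, and the nearest-centroid classifier, the accuracy metric, and all function names are hypothetical simplifications, not the setup used by the authors.

```python
import numpy as np


def fit_centroids(x, y):
    """Fit a nearest-centroid classifier: one mean feature vector per class."""
    classes = np.unique(y)
    centroids = np.stack([x[y == c].mean(axis=0) for c in classes])
    return classes, centroids


def predict(model, x):
    """Assign each sample to the class with the closest centroid."""
    classes, centroids = model
    # Squared Euclidean distance from every sample to every centroid.
    dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return classes[dists.argmin(axis=1)]


def accuracy(model, x, y):
    """Fraction of samples whose predicted class matches the label."""
    return float((predict(model, x) == y).mean())


def toy_dataset(rng, n_per_class, shift):
    """Hypothetical stand-in for ECG features: two Gaussian classes."""
    x0 = rng.normal(loc=0.0, scale=1.0, size=(n_per_class, 8))
    x1 = rng.normal(loc=shift, scale=1.0, size=(n_per_class, 8))
    x = np.concatenate([x0, x1])
    y = np.concatenate([
        np.zeros(n_per_class, dtype=int),
        np.ones(n_per_class, dtype=int),
    ])
    return x, y


rng = np.random.default_rng(0)
x_real_train, y_real_train = toy_dataset(rng, 200, shift=2.0)
x_real_test, y_real_test = toy_dataset(rng, 200, shift=2.0)
# Assumption: an imperfect generator is mimicked by a slightly different shift.
x_synth, y_synth = toy_dataset(rng, 200, shift=1.8)

# Baseline: train on real data, test on held-out real data.
trtr = accuracy(fit_centroids(x_real_train, y_real_train),
                x_real_test, y_real_test)
# Synthetic-only training: train on synthetic data, test on the same real set.
tstr = accuracy(fit_centroids(x_synth, y_synth),
                x_real_test, y_real_test)

print(f"train real, test real:      {trtr:.3f}")
print(f"train synthetic, test real: {tstr:.3f}")
```

The gap between the two scores is the quantity of interest: the closer the synthetic-only score comes to the real-data baseline, the more of the label-relevant structure the generated samples appear to preserve.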