Synthetic ECG Signal Generation Using Generative Neural Networks
This work addresses data scarcity and imbalance in ECG datasets for medical diagnosis models, but it is incremental as it applies existing GAN methods to a specific domain.
The paper tackled the problem of imbalanced and scarce ECG datasets for training automatic diagnosis models by evaluating five GAN variants for synthetic ECG generation, focusing on normal cardiac cycles. The results showed that all models could generate acceptable heartbeats. Visual inspection favored BiLSTM-DC GAN and WGAN, while Classic GAN achieved the highest productivity rate at 72%. Augmenting the imbalanced dataset with synthetic signals significantly improved classification performance.
Electrocardiogram (ECG) datasets tend to be highly imbalanced due to the scarcity of abnormal cases. Additionally, the use of real patients' ECGs is highly regulated due to privacy issues. Therefore, there is always a need for more ECG data, especially for the training of automatic diagnosis machine learning models, which perform better when trained on a balanced dataset. We studied the synthetic ECG generation capability of five different models from the generative adversarial network (GAN) family and compared their performance, focusing only on Normal cardiac cycles. Dynamic Time Warping (DTW), Fréchet, and Euclidean distance functions were employed to quantitatively measure performance. Five different methods for evaluating generated beats were proposed and applied. We also proposed three new concepts (threshold, accepted beat, and productivity rate) and employed them along with the aforementioned methods as a systematic way to compare models. The results show that all the tested models can, to an extent, successfully mass-generate acceptable heartbeats with high similarity in morphological features, and potentially all of them can be used to augment imbalanced datasets. However, visual inspections of generated beats favor BiLSTM-DC GAN and WGAN, as they produce statistically more acceptable beats. Also, with regard to productivity rate, the Classic GAN is superior with a 72% productivity rate. We also designed a simple experiment with a state-of-the-art classifier (ECGResNet34) to show empirically that augmenting the imbalanced dataset with synthetic ECG signals could significantly improve classification performance.
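The abstract names the three distance functions and the threshold / accepted beat / productivity rate concepts without defining them here. The following is a minimal sketch of one plausible reading, not the paper's implementation. It assumes each beat is a fixed-length 1-D array, that a generated beat is "accepted" when its distance to a single reference template is at or below the threshold, and that the productivity rate is the fraction of accepted beats. The function names, the single-template comparison, and the use of the discrete Fréchet distance over (sample index, amplitude) points are all assumptions made for illustration.

```python
import numpy as np


def euclidean_distance(a, b):
    """Point-wise Euclidean distance between two equal-length beats."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sqrt(np.sum((a - b) ** 2)))


def dtw_distance(a, b):
    """Classic O(n*m) dynamic time warping with absolute-difference cost."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])


def discrete_frechet_distance(a, b):
    """Discrete Frechet distance, treating each beat as a curve of
    (sample index, amplitude) points. Assumption: the paper may use a
    different Frechet formulation."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    p = np.column_stack([np.arange(len(a)), a])
    q = np.column_stack([np.arange(len(b)), b])
    n, m = len(p), len(q)
    ca = np.full((n, m), np.inf)
    for i in range(n):
        for j in range(m):
            d = float(np.linalg.norm(p[i] - q[j]))
            if i == 0 and j == 0:
                ca[i, j] = d
            elif i == 0:
                ca[i, j] = max(ca[0, j - 1], d)
            elif j == 0:
                ca[i, j] = max(ca[i - 1, 0], d)
            else:
                ca[i, j] = max(min(ca[i - 1, j], ca[i, j - 1], ca[i - 1, j - 1]), d)
    return float(ca[n - 1, m - 1])


def productivity_rate(generated, template, distance_fn, threshold):
    """Fraction of generated beats whose distance to the template is at or
    below the threshold (hypothetical definition of 'accepted beat')."""
    distances = np.array([distance_fn(beat, template) for beat in generated])
    accepted = distances <= threshold
    return float(accepted.mean()), accepted
```

Under these assumptions, a call such as `productivity_rate(beats, template, dtw_distance, threshold)` returns the accepted fraction together with a boolean mask of accepted beats, which could then be used to select the beats added to an imbalanced training set. How the threshold itself is chosen is not specified in this section, so it is left as a caller-supplied parameter.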