SP, LG · Dec 13, 2022

HeartBEiT: Vision Transformer for Electrocardiogram Data Improves Diagnostic Performance at Low Sample Sizes

Akhil Vaid, Joy Jiang, Ashwin Sawant, Stamatios Lerakis, Edgar Argulian, Yuri Ahuja, Joshua Lampert, Alexander Charney, Hayit Greenspan, Benjamin Glicksberg, Jagat Narula, Girish Nadkarni

arXiv:2212.14040v14.36 citations · h-index: 65

Originality: Highly original

AI Analysis

This paper addresses the challenge of developing accurate ECG diagnostic models when only limited training data is available. The analysis rates it as an incremental improvement over existing methods.

The researchers tackled the problem of ECG analysis requiring large sample sizes by creating HeartBEiT, the first vision-based transformer model for ECG waveforms, which showed significantly higher diagnostic performance at low sample sizes compared to CNN architectures.

The electrocardiogram (ECG) is a ubiquitous diagnostic modality. Convolutional neural networks (CNNs) applied to ECG analysis require large sample sizes, and transfer learning approaches result in suboptimal performance when pre-training is done on natural images. We leveraged masked image modeling to create the first vision-based transformer model, HeartBEiT, for electrocardiogram waveform analysis. We pre-trained this model on 8.5 million ECGs and then compared its performance with standard CNN architectures for diagnosis of hypertrophic cardiomyopathy, low left ventricular ejection fraction, and ST elevation myocardial infarction using differing training sample sizes and independent validation datasets. We show that HeartBEiT has significantly higher performance at lower sample sizes compared to other models. Finally, we also show that HeartBEiT improves explainability of diagnosis by highlighting biologically relevant regions of the ECG compared with standard CNNs. Thus, we present the first vision-based waveform transformer that can be used to develop specialized models for ECG analysis, especially at low sample sizes.
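The masked image modeling step mentioned in the abstract can be illustrated with a minimal sketch: an ECG rendered as an image is split into patches, and a random subset of patches is hidden so that a model can be trained to recover them. Everything below is an illustrative assumption rather than the HeartBEiT configuration: the 224x224 image size, the 16-pixel patch, the 40% mask ratio, uniform random masking, and the helper names `patchify` and `random_patch_mask` are all hypothetical. The transformer and its pre-training objective are not shown.

```python
import random


def patchify(image, patch):
    """Split a 2-D image (list of rows) into non-overlapping square patches, row-major."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image dimensions must be divisible by the patch size")
    tiles = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tiles.append([row[left:left + patch] for row in image[top:top + patch]])
    return tiles


def random_patch_mask(num_patches, mask_ratio, rng):
    """Return a list of booleans; True marks a patch hidden from the model."""
    num_masked = round(num_patches * mask_ratio)
    masked = set(rng.sample(range(num_patches), num_masked))
    return [index in masked for index in range(num_patches)]


# Assumed sizes for illustration only; random values stand in for a plotted ECG.
rng = random.Random(0)
ecg_image = [[rng.random() for _ in range(224)] for _ in range(224)]

patches = patchify(ecg_image, patch=16)
mask = random_patch_mask(len(patches), mask_ratio=0.4, rng=rng)

# Masked patches are blanked in the model input; the originals stay as targets.
blank = [[0.0] * 16 for _ in range(16)]
model_input = [blank if hidden else tile for tile, hidden in zip(patches, mask)]
```

In a pre-training loop of this kind, the model would see `model_input` and be scored only on the positions where `mask` is true, which is what lets it learn from unlabeled ECGs before fine-tuning on small labeled sets.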
