Aug 21, 2025

Learning ECG Representations via Poly-Window Contrastive Learning

Stanford
arXiv:2508.15225v1
1 citation
h-index: 18
2025 IEEE EMBS International Conference on Biomedical and Health Informatics (BHI)
Originality: Incremental advance
AI Analysis

This paper addresses efficient self-supervised learning for ECG analysis in medical AI. It appears incremental, since it builds on existing contrastive learning methods and adds a specific temporal enhancement.

The paper tackles the problem of limited annotated ECG data for cardiovascular disease diagnosis by proposing a poly-window contrastive learning framework that extracts multiple temporal windows from ECG signals to learn robust representations. The method outperforms conventional two-view approaches on the PTB-XL dataset, achieving higher AUROC (0.891 vs. 0.888) and F1 scores (0.680 vs. 0.679) while reducing pre-training epochs by up to 4x and total computation time by 14.8%.

Electrocardiogram (ECG) analysis is foundational for cardiovascular disease diagnosis, yet the performance of deep learning models is often constrained by limited access to annotated data. Self-supervised contrastive learning has emerged as a powerful approach for learning robust ECG representations from unlabeled signals. However, most existing methods generate only pairwise augmented views and fail to leverage the rich temporal structure of ECG recordings. In this work, we present a poly-window contrastive learning framework. We extract multiple temporal windows from each ECG instance to construct positive pairs and maximize their agreement via statistics. Inspired by the principle of slow feature analysis, our approach explicitly encourages the model to learn temporally invariant and physiologically meaningful features that persist across time. We validate our approach through extensive experiments and ablation studies on the PTB-XL dataset. Our results demonstrate that poly-window contrastive learning consistently outperforms conventional two-view methods in multi-label superclass classification, achieving higher AUROC (0.891 vs. 0.888) and F1 scores (0.680 vs. 0.679) while requiring up to four times fewer pre-training epochs (32 vs. 128) and reducing total wall-clock pre-training time by 14.8%. Although the method processes multiple windows per sample, the reduction in training epochs lowers total computation time, making it practical for training foundational models. Through extensive ablations, we identify optimal design choices and demonstrate robustness across various hyperparameters. These findings establish poly-window contrastive learning as a highly efficient and scalable paradigm for automated ECG analysis and provide a promising general framework for self-supervised representation learning in biomedical time-series data.
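
The window-extraction and agreement steps described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not specify the paper's actual objective (it says only that agreement is maximized "via statistics"), so the loss below is a swapped-in, generic multi-positive InfoNCE loss, not the authors' formulation. The function names (`extract_windows`, `toy_encoder`, `multi_positive_info_nce`), the temperature, the window length, and the batch shape are all hypothetical choices made for illustration.

```python
import numpy as np


def extract_windows(signal, num_windows, window_len, rng):
    """Crop `num_windows` random temporal windows from one recording.

    signal: array of shape (leads, samples).
    Returns an array of shape (num_windows, leads, window_len).
    """
    max_start = signal.shape[-1] - window_len
    starts = rng.integers(0, max_start + 1, size=num_windows)
    return np.stack([signal[:, s:s + window_len] for s in starts])


def toy_encoder(windows):
    """Placeholder encoder (assumption): per-lead mean and std over time.

    A real setup would use a trained neural network here.
    """
    return np.concatenate([windows.mean(axis=-1), windows.std(axis=-1)], axis=1)


def multi_positive_info_nce(embeddings, record_ids, temperature=0.1):
    """Generic contrastive loss (assumption, not the paper's objective).

    Windows that share a record id are treated as positives for each other;
    every other window in the batch is a negative. Each record needs at
    least two windows so that every anchor has a positive.
    """
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    logits = z @ z.T / temperature
    self_mask = np.eye(len(z), dtype=bool)
    logits = np.where(self_mask, -np.inf, logits)  # never compare a window to itself

    # Numerically stable log-softmax over all other windows in the batch.
    row_max = logits.max(axis=1, keepdims=True)
    shifted = logits - row_max
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    positives = (record_ids[:, None] == record_ids[None, :]) & ~self_mask
    pos_log_prob = np.where(positives, log_prob, 0.0).sum(axis=1)
    return float(np.mean(-pos_log_prob / positives.sum(axis=1)))


# Illustrative batch: 4 synthetic 12-lead recordings of 1000 samples each.
rng = np.random.default_rng(0)
recordings = rng.standard_normal((4, 12, 1000))
num_windows, window_len = 4, 250

windows = np.concatenate(
    [extract_windows(rec, num_windows, window_len, rng) for rec in recordings]
)
record_ids = np.repeat(np.arange(len(recordings)), num_windows)
loss = multi_positive_info_nce(toy_encoder(windows), record_ids)
print(f"windows: {windows.shape}, loss: {loss:.3f}")
```

With more than two windows per recording, each anchor has several positives at once, which is the main structural difference from a two-view setup where each anchor has exactly one.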
