SS-DPPN: A self-supervised dual-path foundation model for the generalizable cardiac audio representation
This work addresses the scarcity of expert-annotated data for cardiovascular disease diagnosis and offers a scalable approach to physiological signal analysis, though the contribution is incremental in that it builds on existing self-supervised and metric-learning techniques.
The paper tackles automated phonocardiogram analysis by proposing SS-DPPN, a self-supervised foundation model that learns from unlabeled cardiac audio, achieves state-of-the-art performance on four benchmarks, and demonstrates data efficiency with a three-fold reduction in the labeled data required.
The automated analysis of phonocardiograms is vital for the early diagnosis of cardiovascular disease, yet supervised deep learning is often constrained by the scarcity of expert-annotated data. In this paper, we propose the Self-Supervised Dual-Path Prototypical Network (SS-DPPN), a foundation model for cardiac audio representation and classification that learns from unlabeled data. The framework introduces a dual-path, contrastive-learning-based architecture that simultaneously processes 1D waveforms and 2D spectrograms using a novel hybrid loss. For the downstream task, a metric-learning approach based on a Prototypical Network enhances sensitivity and produces well-calibrated, trustworthy predictions. SS-DPPN achieves state-of-the-art performance on four cardiac audio benchmarks. The framework also demonstrates exceptional data efficiency, performing comparably to a fully supervised model with a three-fold reduction in labeled data. Finally, the learned representations generalize successfully to lung sound classification and heart rate estimation. Our experiments and findings validate SS-DPPN as a robust, reliable, and scalable foundation model for physiological signals.
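The two mechanisms named in the abstract, cross-path contrastive pretraining and prototype-based classification, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the hybrid loss, so a plain InfoNCE term between waveform and spectrogram embeddings stands in for it, and the function names (`info_nce`, `class_prototypes`, `predict_proba`), the temperature value, and the toy 2-D embeddings are all assumptions made for illustration.

```python
import math


def info_nce(wave_embs, spec_embs, temperature=0.1):
    """Stand-in contrastive loss (assumption, not the paper's hybrid loss).

    The i-th waveform embedding is pulled toward the i-th spectrogram
    embedding and pushed away from the other spectrograms in the batch.
    """
    def normalize(vec):
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        return [x / norm for x in vec]

    waves = [normalize(v) for v in wave_embs]
    specs = [normalize(v) for v in spec_embs]
    loss = 0.0
    for i, wave in enumerate(waves):
        # Cosine similarity to every spectrogram embedding, scaled by temperature.
        logits = [sum(a * b for a, b in zip(wave, spec)) / temperature for spec in specs]
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        loss += log_denominator - logits[i]
    return loss / len(waves)


def class_prototypes(support_embeddings, support_labels):
    """Average the support embeddings of each class into one prototype."""
    sums, counts = {}, {}
    for vec, label in zip(support_embeddings, support_labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def predict_proba(query, prototypes):
    """Softmax over negative squared Euclidean distances to each prototype."""
    logits = {
        label: -sum((q - p) ** 2 for q, p in zip(query, proto))
        for label, proto in prototypes.items()
    }
    top = max(logits.values())
    exps = {label: math.exp(l - top) for label, l in logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


# Toy 2-D embeddings (hypothetical values, not real encoder outputs).
aligned_loss = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled_loss = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])

support = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.0, 0.8]]
labels = ["normal", "normal", "abnormal", "abnormal"]
prototypes = class_prototypes(support, labels)
probs = predict_proba([0.1, 0.1], prototypes)
```

In this sketch the contrastive loss is lower when each waveform embedding is paired with its own spectrogram than when the pairs are shuffled, and a query close to the "normal" prototype receives most of the probability mass. Because the class probabilities are a function of distances to prototypes, a query far from every prototype yields a flatter distribution, which is one reason distance-based classifiers are often considered in settings where calibration matters.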