Alexander Selivanov

h-index: 4

3 papers

127 citations

3 Papers

19.4 SP Jul 15

ECG-LLM: Foundation Model for ECG-Based Cardiac Reasoning

Alexander Selivanov, Friederike Jungmann, Jan Kehrer et al.

Electrocardiography (ECG) is an inexpensive, standard-of-care test for cardiac symptoms, but front-line triage often lacks immediate access to definitive imaging such as echocardiography (ECHO) or cardiac magnetic resonance (CMR). Furthermore, most existing ECG AI systems are limited to fixed diagnostic labels or automated reports, constraining their use for patient-specific clinical reasoning. To address this gap, we introduce ECG-LLM, an ECG-conditioned large language model trained across four cohorts comprising 679,112 ECG studies from 186,409 patients. Using a novel multimodal-to-language supervision strategy, ECG-LLM is trained on clinically structured question-answer pairs derived from ECG signals, clinical context, CMR, and ECHO. This unified approach enables the model to answer diverse cardiovascular questions from a 12-lead ECG alone, spanning both conventional interpretation and phenotypes not directly visible on standard ECGs. ECG-LLM successfully recovers conventional ECG measurements, such as heart rate, and strongly predicts complex CMR-derived phenotypes, including ventricular and atrial volumes and ventricular function. Crucially, it detects vital echocardiographic phenotypes, including increased LV wall thickness, aortic stenosis, and right-ventricular systolic dysfunction. On standard ECG understanding tasks, ECG-LLM matches or exceeds existing baselines for diagnostic report generation and the ECG-QA benchmark. By moving beyond fixed-label prediction, this multimodal framework provides clinically valuable, question-driven cardiovascular reasoning to support general practitioner and front-line triage decisions when specialist review is delayed.

14.5 CV Sep 28, 2022

Medical Image Captioning via Generative Pretrained Transformers

Alexander Selivanov, Oleg Y. Rogov, Daniil Chesakov et al.

We address automatic clinical caption generation with a model that combines the analysis of frontal chest X-ray scans with structured patient information from radiology records. We combine two language models, Show-Attend-Tell and GPT-3, to generate comprehensive and descriptive radiology records. The combined model generates a textual summary containing the essential information about the pathologies found and their location, along with 2D heatmaps localizing each pathology on the original X-ray scans. The model is tested on two medical datasets, Open-I and MIMIC-CXR, and on the general-purpose MS-COCO dataset. Results measured with natural language assessment metrics demonstrate its applicability to chest X-ray image captioning.

8.6 IV Jun 24, 2025 Code

Global and Local Contrastive Learning for Joint Representations from Cardiac MRI and ECG

Alexander Selivanov, Philip Müller, Özgün Turgut et al.

An electrocardiogram (ECG) is a widely used, cost-effective tool for detecting electrical abnormalities in the heart. However, it cannot directly measure functional parameters, such as ventricular volumes and ejection fraction, which are crucial for assessing cardiac function. Cardiac magnetic resonance (CMR) is the gold standard for these measurements, providing detailed structural and functional insights, but it is expensive and less accessible. To bridge this gap, we propose PTACL (Patient and Temporal Alignment Contrastive Learning), a multimodal contrastive learning framework that enhances ECG representations by integrating spatio-temporal information from CMR. PTACL uses a global patient-level contrastive loss and a local temporal-level contrastive loss. The global loss aligns patient-level representations by pulling ECG and CMR embeddings from the same patient closer together while pushing apart embeddings from different patients. The local loss enforces fine-grained temporal alignment within each patient by contrasting encoded ECG segments with the corresponding encoded CMR frames. This approach enriches ECG representations with diagnostic information beyond electrical activity and transfers more insights between modalities than global alignment alone, all without introducing new learnable weights. We evaluate PTACL on paired ECG-CMR data from 27,951 subjects in the UK Biobank. Compared to baseline approaches, PTACL achieves better performance in two clinically relevant tasks: (1) retrieving patients with similar cardiac phenotypes and (2) predicting CMR-derived cardiac function parameters, such as ventricular volumes and ejection fraction. Our results highlight the potential of PTACL to enhance non-invasive cardiac diagnostics using ECG. The code is available at: https://github.com/alsalivan/ecgcmr
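The global and local losses described in this abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). It assumes a symmetric InfoNCE objective, mean pooling over time to obtain patient-level embeddings, a temperature of 0.1, and ECG segments and CMR frames that have already been encoded and aligned to the same number of time steps. The function names `info_nce`, `global_loss`, and `local_loss` are hypothetical.

```python
import numpy as np


def info_nce(a, b, temperature=0.1):
    """Symmetric InfoNCE: row i of `a` is the positive for row i of `b`.

    All other rows act as negatives. The temperature value is an assumption.
    """
    # L2-normalize so the dot product is cosine similarity.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = a @ b.T / temperature

    def cross_entropy_on_diagonal(scores):
        # Numerically stable log-softmax over each row.
        scores = scores - scores.max(axis=1, keepdims=True)
        log_prob = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    # Average the ECG-to-CMR and CMR-to-ECG directions.
    return 0.5 * (cross_entropy_on_diagonal(logits)
                  + cross_entropy_on_diagonal(logits.T))


def global_loss(ecg_seq, cmr_seq, temperature=0.1):
    """Patient-level alignment across the batch.

    Inputs have shape (patients, time_steps, dim). Mean pooling over time
    is an assumption made for this sketch.
    """
    return info_nce(ecg_seq.mean(axis=1), cmr_seq.mean(axis=1), temperature)


def local_loss(ecg_seq, cmr_seq, temperature=0.1):
    """Temporal alignment within each patient.

    ECG step t is the positive for CMR step t of the same patient; the
    other time steps of that patient are the negatives.
    """
    per_patient = [info_nce(e, c, temperature)
                   for e, c in zip(ecg_seq, cmr_seq)]
    return float(np.mean(per_patient))
```

Both functions operate only on encoder outputs and contain no trainable parameters, which is consistent with the abstract's statement that the local loss adds no new learnable weights. How the two terms are weighted and combined is not stated in the abstract, so it is left out here.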