CL · AI · CV · Mar 8, 2025

GEM: Empowering MLLM for Grounded ECG Understanding with Time Series and Images

arXiv:2503.06073v2 · 36 citations · h-index: 9 · Has Code
Originality: Highly original
AI Analysis

The paper addresses insufficient explainability and weak multimodal synergy in automated ECG interpretation for clinical applications, proposing a novel method for a known bottleneck.

The paper tackles the limitations of multimodal large language models (MLLMs) in ECG interpretation by introducing GEM, which unifies time series, images, and text for grounded analysis. Reported gains include a 7.4% improvement in predictive performance on CSN and a 24.8% improvement in grounding.

While recent multimodal large language models (MLLMs) have advanced automated ECG interpretation, they still face two key limitations: (1) insufficient multimodal synergy between time series signals and visual ECG representations, and (2) limited explainability in linking diagnoses to granular waveform evidence. We introduce GEM, the first MLLM unifying ECG time series, 12-lead ECG images, and text for grounded and clinician-aligned ECG interpretation. GEM enables feature-grounded analysis, evidence-driven reasoning, and a clinician-like diagnostic process through three core innovations: a dual-encoder framework extracting complementary time series and image features, cross-modal alignment for effective multimodal understanding, and knowledge-guided instruction generation that produces high-granularity grounding data (ECG-Grounding) linking diagnoses to measurable parameters (e.g., QRS/PR intervals). Additionally, we propose the Grounded ECG Understanding task, a clinically motivated benchmark designed to comprehensively assess the MLLM's capability in grounded ECG understanding. Experimental results on both existing and our proposed benchmarks show GEM significantly improves predictive performance (CSN $7.4\% \uparrow$), explainability ($22.7\% \uparrow$), and grounding ($24.8\% \uparrow$), making it more suitable for real-world clinical applications. GitHub repository: https://github.com/lanxiang1017/GEM.git
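To make "linking diagnoses to measurable parameters" concrete, here is a minimal sketch of a finding paired with the measurement that supports it. It is not GEM's pipeline or the ECG-Grounding data format. The names `GroundedFinding` and `ground_findings` are hypothetical, and the cutoffs are the common textbook ones (PR interval above 200 ms, QRS duration of 120 ms or more), assumed here purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GroundedFinding:
    """A finding paired with the measurement that supports it (hypothetical type)."""
    finding: str
    evidence: str


def ground_findings(pr_ms: float, qrs_ms: float) -> List[GroundedFinding]:
    """Map two interval measurements (in milliseconds) to evidence-backed findings.

    Thresholds are textbook cutoffs assumed for illustration,
    not values taken from the GEM paper.
    """
    findings: List[GroundedFinding] = []
    if pr_ms > 200:
        findings.append(GroundedFinding(
            finding="prolonged PR interval",
            evidence=f"PR interval = {pr_ms:.0f} ms (> 200 ms)",
        ))
    if qrs_ms >= 120:
        findings.append(GroundedFinding(
            finding="wide QRS complex",
            evidence=f"QRS duration = {qrs_ms:.0f} ms (>= 120 ms)",
        ))
    if not findings:
        findings.append(GroundedFinding(
            finding="PR and QRS within the checked limits",
            evidence=f"PR interval = {pr_ms:.0f} ms, QRS duration = {qrs_ms:.0f} ms",
        ))
    return findings


if __name__ == "__main__":
    for item in ground_findings(pr_ms=220, qrs_ms=90):
        print(f"{item.finding}: {item.evidence}")
```

Each output statement carries its own numeric evidence, which is the property a grounded interpretation is meant to have. A real system would derive the measurements from the signal rather than take them as arguments.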

Code Implementations: 1 repo