ECG-aBcDe: Overcoming Model Dependence, Encoding ECG into a Universal Language for Any LLM
This work addresses challenges in ECG analysis for medical applications by enabling transferability across LLMs and improving interpretability, though it appears incremental because it builds on existing LLM and ECG encoding methods.
The paper tackles the integration of ECG analysis with LLMs by developing ECG-aBcDe, a method that encodes ECG signals into a universal language usable by any LLM. It achieves competitive performance on ROUGE-L and METEOR and improves BLEU-4 by 2.8 times in in-dataset evaluation and 3.9 times in cross-dataset evaluation, reaching scores of 42.58 and 30.76, respectively.
Large Language Models (LLMs) hold significant promise for electrocardiogram (ECG) analysis, yet challenges remain regarding transferability, time-scale information learning, and interpretability. Current methods rely on model-specific ECG encoders, which hinders transfer across LLMs. Furthermore, LLMs struggle to capture the crucial time-scale information inherent in ECGs due to Transformer limitations. In addition, the black-box nature of LLMs limits clinical adoption. To address these limitations, we introduce ECG-aBcDe, a novel ECG encoding method that transforms ECG signals into a universal ECG language readily interpretable by any LLM. By constructing a hybrid dataset of ECG language and natural language, ECG-aBcDe enables direct fine-tuning of pre-trained LLMs without architectural modifications, achieving a "construct once, use anywhere" capability. Moreover, the bidirectional convertibility between ECG signals and the ECG language allows attention heatmaps to be extracted from ECG signals, significantly enhancing interpretability. Finally, ECG-aBcDe explicitly represents time-scale information, mitigating Transformer limitations. This work presents a new paradigm for integrating ECG analysis with LLMs. Compared with existing methods, our method achieves competitive performance on ROUGE-L and METEOR. Notably, it improves BLEU-4 by 2.8 times and 3.9 times in in-dataset and cross-dataset evaluations, respectively, reaching scores of 42.58 and 30.76. These results provide strong evidence for the feasibility of the new paradigm.
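The abstract does not specify how ECG-aBcDe maps signals to text, so the sketch below is not the paper's encoding. It is a purely hypothetical toy quantizer, written only to illustrate what bidirectional convertibility between a signal and a plain-text "language" can mean. The names `encode` and `decode`, the 26-letter alphabet, and the amplitude range are all assumptions made for illustration. The toy does not represent time-scale information, which the paper's method is said to encode explicitly.

```python
import math
import string

# Hypothetical alphabet (an assumption): one lowercase letter per amplitude bin.
ALPHABET = string.ascii_lowercase


def encode(signal, lo=-1.0, hi=1.0):
    """Map each sample to a letter by uniform quantization over [lo, hi]."""
    n = len(ALPHABET)
    step = (hi - lo) / n
    letters = []
    for x in signal:
        clipped = min(max(x, lo), hi)          # keep the sample inside the range
        idx = min(int((clipped - lo) / step), n - 1)
        letters.append(ALPHABET[idx])
    return "".join(letters)


def decode(text, lo=-1.0, hi=1.0):
    """Map each letter back to the centre of its amplitude bin (lossy inverse)."""
    n = len(ALPHABET)
    step = (hi - lo) / n
    return [lo + (ALPHABET.index(ch) + 0.5) * step for ch in text]


# Toy waveform: one sine cycle standing in for a real ECG lead.
signal = [math.sin(2 * math.pi * i / 16) for i in range(16)]
text = encode(signal)        # plain text that any LLM tokenizer can consume
recovered = decode(text)     # approximate signal rebuilt from the text
max_error = max(abs(a - b) for a, b in zip(signal, recovered))
```

In this toy, each letter corresponds to exactly one sample, so a per-token score such as an attention weight could be mapped back to a position in the waveform. The round trip is lossy: the reconstruction error is bounded by half a bin width.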