ECG-Agent: On-Device Tool-Calling Agent for ECG Multi-Turn Dialogue
This addresses the problem of limited real-world usability in ECG analysis for medical professionals or patients by enabling efficient, accurate multi-turn dialogues on devices, though it appears incremental as it builds on existing LLM and tool-calling paradigms.
The authors tackled the lack of multi-turn conversational ability, on-device efficiency, and precise ECG measurement understanding in existing models by introducing ECG-Agent, an LLM-based tool-calling agent for ECG dialogue, which outperformed baseline ECG-LLMs in response accuracy and showed on-device agents achieving comparable performance to larger ones.
Recent advances in Multimodal Large Language Models have rapidly expanded to electrocardiograms, focusing on classification, report generation, and single-turn QA tasks. However, these models fall short in real-world scenarios, lacking multi-turn conversational ability, on-device efficiency, and precise understanding of ECG measurements such as the PQRST intervals. To address these limitations, we introduce ECG-Agent, the first LLM-based tool-calling agent for multi-turn ECG dialogue. To facilitate its development and evaluation, we also present ECG-Multi-Turn-Dialogue (ECG-MTD) dataset, a collection of realistic user-assistant multi-turn dialogues for diverse ECG lead configurations. We develop ECG-Agents in various sizes, from on-device capable to larger agents. Experimental results show that ECG-Agents outperform baseline ECG-LLMs in response accuracy. Furthermore, on-device agents achieve comparable performance to larger agents in various evaluations that assess response accuracy, tool-calling ability, and hallucinations, demonstrating their viability for real-world applications.