PulseLM: A Foundation Dataset and Benchmark for PPG-Text Learning
This work provides a standardized resource for researchers to develop and evaluate language-grounded models for PPG-based physiological inference, addressing the lack of text-annotated PPG data.
PulseLM introduces a large-scale PPG-text question-answering dataset with over 1 million PPG segments and nearly 2.5 million QA pairs, enabling language-based interfaces for physiological monitoring. The dataset aggregates 16 sources into 12 tasks and establishes baseline benchmarks for multimodal PPG-aware LLMs.
Photoplethysmography (PPG) is a widely used non-invasive sensing modality for continuous cardiovascular and physiological monitoring across clinical, laboratory, and wearable settings. While existing PPG datasets support a broad range of downstream tasks, they typically provide supervision in the form of numerical measurements or task-specific labels, limiting their compatibility with language-based interfaces and multimodal foundation models. In this work, we introduce PulseLM, a large-scale PPG-text dataset that bridges raw PPG waveforms and natural language through a unified question-answering (QA) formulation. PulseLM aggregates PPG recordings from 16 publicly available sources and harmonizes their heterogeneous annotations into 12 downstream tasks. The dataset comprises over 1 million standardized 10-second PPG segments, associated with nearly 2.5 million question-answer pairs. We further define reproducible data-processing, training, and evaluation protocols and establish baseline benchmarks using multimodal PPG-aware large language models. PulseLM provides a standardized foundation for studying language-grounded physiological inference, cross-dataset generalization, and scalable benchmarking of PPG-based multimodal models. We publicly release the dataset and code at https://huggingface.co/datasets/Manhph2211/PulseLM and https://github.com/manhph2211/PULSE-LM, respectively.
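As a rough illustration of the QA formulation described above, the following minimal sketch splits a recording into 10-second segments and rewrites a numeric annotation as a question-answer pair. It is not the released pipeline: the function names, the 125 Hz sampling rate, the question template, and the heart-rate label are all hypothetical choices made for this example, and only the 10-second segment length comes from the abstract.

```python
import math

SEGMENT_SECONDS = 10  # segment length stated in the abstract


def segment_ppg(signal, fs):
    """Split a 1-D PPG recording into non-overlapping 10-second windows."""
    samples = int(SEGMENT_SECONDS * fs)
    n_segments = len(signal) // samples  # the trailing partial window is dropped
    return [signal[i * samples:(i + 1) * samples] for i in range(n_segments)]


def to_qa_pair(label_name, value, unit):
    """Rewrite a numeric annotation as a QA pair (hypothetical template)."""
    return {
        "question": f"What is the {label_name} for this PPG segment?",
        "answer": f"{value:g} {unit}",
    }


# Synthetic 35-second recording at an assumed 125 Hz sampling rate.
fs = 125
recording = [math.sin(2 * math.pi * 1.2 * t / fs) for t in range(35 * fs)]

segments = segment_ppg(recording, fs)        # three full 10-second windows
qa = to_qa_pair("heart rate", 72, "bpm")     # illustrative label, not real data
```

Each segment can then be paired with one or more such QA records. The real dataset's schema and resampling choices should be taken from the released code rather than from this sketch.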