AI · Feb 27

SleepLM: Natural-Language Intelligence for Human Sleep

Zongzhe Xu, Zitao Shuai, Eideen Mozaffari, Ravi S. Aysola, Rajesh Kumar, Yuzhe Yang
arXiv:2602.23605v1 · 2 citations · Has Code
Originality: Highly original
AI Analysis

This work targets researchers and clinicians who need more flexible and interpretable sleep analysis systems. It proposes a new method for a known bottleneck rather than an incremental improvement.

The authors address the limited interpretability and generalization of sleep analysis with SleepLM, a family of sleep-language foundation models that align natural language with multimodal polysomnography. They report that SleepLM outperforms state-of-the-art methods in zero-shot and few-shot learning, cross-modal retrieval, and sleep captioning.

We present SleepLM, a family of sleep-language foundation models that enable human sleep alignment, interpretation, and interaction with natural language. Despite the critical role of sleep, learning-based sleep analysis systems operate in closed label spaces (e.g., predefined stages or events) and fail to describe, query, or generalize to novel sleep phenomena. SleepLM bridges natural language and multimodal polysomnography, enabling language-grounded representations of sleep physiology. To support this alignment, we introduce a multilevel sleep caption generation pipeline that enables the curation of the first large-scale sleep-text dataset, comprising over 100K hours of data from more than 10,000 individuals. Furthermore, we present a unified pretraining objective that combines contrastive alignment, caption generation, and signal reconstruction to better capture physiological fidelity and cross-modal interactions. Extensive experiments on real-world sleep understanding tasks verify that SleepLM outperforms state-of-the-art in zero-shot and few-shot learning, cross-modal retrieval, and sleep captioning. Importantly, SleepLM also exhibits intriguing capabilities including language-guided event localization, targeted insight generation, and zero-shot generalization to unseen tasks. All code and data will be open-sourced.
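
The abstract describes a unified pretraining objective that combines contrastive alignment, caption generation, and signal reconstruction, but does not give its exact form. The following is a minimal sketch of how three such terms could be combined, not the authors' implementation. The weighted sum, the weights `w_con`, `w_cap`, `w_rec`, the temperature of 0.07, the symmetric InfoNCE form, and all function names are assumptions chosen for illustration.

```python
import math

def log_softmax(row):
    # Numerically stable log-softmax over a list of floats.
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    # L2-normalize; fall back to 1.0 to avoid dividing by zero.
    n = math.sqrt(dot(v, v)) or 1.0
    return [x / n for x in v]

def contrastive_loss(signal_emb, text_emb, temperature=0.07):
    # Symmetric InfoNCE (assumed form): the i-th signal and the
    # i-th caption in the batch are treated as the positive pair.
    s = [normalize(v) for v in signal_emb]
    t = [normalize(v) for v in text_emb]
    n = len(s)
    logits = [[dot(s[i], t[j]) / temperature for j in range(n)] for i in range(n)]
    s2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    cols = [[logits[i][j] for i in range(n)] for j in range(n)]
    t2s = -sum(log_softmax(cols[j])[j] for j in range(n)) / n
    return 0.5 * (s2t + t2s)

def caption_loss(token_logits, target_ids):
    # Mean next-token cross-entropy over one caption.
    per_token = [log_softmax(l)[y] for l, y in zip(token_logits, target_ids)]
    return -sum(per_token) / len(target_ids)

def reconstruction_loss(pred, target):
    # Mean squared error between reconstructed and original samples.
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(target)

def unified_loss(signal_emb, text_emb, token_logits, target_ids,
                 recon, signal, w_con=1.0, w_cap=1.0, w_rec=1.0):
    # Hypothetical weighted sum of the three terms named in the abstract.
    return (w_con * contrastive_loss(signal_emb, text_emb)
            + w_cap * caption_loss(token_logits, target_ids)
            + w_rec * reconstruction_loss(recon, signal))
```

In this sketch, matched signal and caption embeddings give a contrastive term near zero, while swapped pairs give a large one, so the term rewards cross-modal alignment. The captioning and reconstruction terms then add pressure to keep the representation descriptive of the text and faithful to the raw signal.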
