CL AIDec 28, 2025

LENS: LLM-Enabled Narrative Synthesis for Mental Health by Aligning Multimodal Sensing with Language Models

Wenxuan Xu, Arvind Pillai, Subigya Nepal, Amanda C Collins, Daniel M Mackin, Michael V Heinz, Tess Z Griffin, Nicholas C Jacobson, Andrew Campbell

arXiv:2512.23025v14.91 citationsh-index: 18

Originality Incremental advance

AI Analysis

This addresses the problem of interpreting complex sensor data for mental health professionals, though it appears incremental as it builds on existing LLM and sensing technologies.

The paper tackles the challenge of translating multimodal health sensor data into natural language for mental health assessment by introducing LENS, a framework that aligns sensor data with language models to generate clinically grounded narratives. Results show LENS outperforms baselines on NLP metrics and symptom-severity accuracy, with a user study indicating its narratives are comprehensive and clinically meaningful.

Multimodal health sensing offers rich behavioral signals for assessing mental health, yet translating these numerical time-series measurements into natural language remains challenging. Current LLMs cannot natively ingest long-duration sensor streams, and paired sensor-text datasets are scarce. To address these challenges, we introduce LENS, a framework that aligns multimodal sensing data with language models to generate clinically grounded mental-health narratives. LENS first constructs a large-scale dataset by transforming Ecological Momentary Assessment (EMA) responses related to depression and anxiety symptoms into natural-language descriptions, yielding over 100,000 sensor-text QA pairs from 258 participants. To enable native time-series integration, we train a patch-level encoder that projects raw sensor signals directly into an LLM's representation space. Our results show that LENS outperforms strong baselines on standard NLP metrics and task-specific measures of symptom-severity accuracy. A user study with 13 mental-health professionals further indicates that LENS-produced narratives are comprehensive and clinically meaningful. Ultimately, our approach advances LLMs as interfaces for health sensing, providing a scalable path toward models that can reason over raw behavioral signals and support downstream clinical decision-making.

View on arXiv PDF

Similar