HARBOR: Holistic Adaptive Risk assessment model for BehaviORal healthcare
This work addresses risk assessment for behavioral healthcare patients. Its contribution is incremental: it builds on existing language model approaches, adding a new dataset and a task-specific scoring method.
The authors tackle behavioral healthcare risk assessment by introducing HARBOR, a language model that predicts a discrete mood and risk score. HARBOR achieves 69% accuracy, compared to 54% for logistic regression and 29% for the strongest proprietary LLM baseline.
Behavioral healthcare risk assessment remains a challenging problem due to the highly multimodal nature of patient data and the temporal dynamics of mood and affective disorders. While large language models (LLMs) have demonstrated strong reasoning capabilities, their effectiveness in structured clinical risk scoring remains unclear. In this work, we introduce HARBOR, a behavioral-health-aware language model designed to predict a discrete mood and risk score, termed the Harbor Risk Score (HRS), on an integer scale from -3 (severe depression) to +3 (mania). We also release PEARL, a longitudinal behavioral healthcare dataset spanning four years of monthly observations from three patients, containing physiological, behavioral, and self-reported mental health signals. We benchmark traditional machine learning models, proprietary LLMs, and HARBOR across multiple evaluation settings and ablations. Our results show that HARBOR outperforms classical baselines and off-the-shelf LLMs, achieving 69 percent accuracy compared to 54 percent for logistic regression and 29 percent for the strongest proprietary LLM baseline.
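To make the scoring setup concrete, the following is a minimal sketch of how predictions on the HRS scale might be validated and scored. It assumes that the reported accuracy is exact-match accuracy over integer labels, which the abstract does not state explicitly. The function names `validate_hrs` and `exact_match_accuracy` are hypothetical and are not taken from the authors' code; only the scale endpoints (-3 and +3) come from the abstract.

```python
from typing import Sequence

# Scale endpoints as stated in the abstract:
# -3 (severe depression) to +3 (mania).
HRS_MIN, HRS_MAX = -3, 3


def validate_hrs(score: int) -> int:
    """Return the score if it is an integer on the HRS scale, else raise ValueError."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, int):
        raise ValueError(f"HRS must be an integer, got {score!r}")
    if not HRS_MIN <= score <= HRS_MAX:
        raise ValueError(f"HRS must lie in [{HRS_MIN}, {HRS_MAX}], got {score}")
    return score


def exact_match_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions equal to the reference label.

    Assumption: "accuracy" means exact match on the integer score.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on empty input")
    for score in (*y_true, *y_pred):
        validate_hrs(score)
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


if __name__ == "__main__":
    # Toy values for illustration only; these are not PEARL data.
    reference = [-3, 0, 3, 1]
    predicted = [-3, 0, 2, 1]
    print(exact_match_accuracy(reference, predicted))
```

Exact match treats a prediction of +2 against a reference of +3 the same as a prediction of -3, so on an ordinal scale like this one it is a strict criterion. Whether the authors also report a distance-aware metric is not stated in the abstract.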