LG AI CL CY HCMar 13, 2025

MentalChat16K: A Benchmark Dataset for Conversational Mental Health Assistance

Jia Xu, Tianyi Wei, Bojian Hou, Patryk Orzechowski, Shu Yang, Ruochen Jin, Rachael Paulbeck, Joost Wagenaar, George Demiris, Li Shen

arXiv:2503.13509v236 citationsh-index: 13Has CodeKDD

Originality Synthesis-oriented

AI Analysis

This provides a curated resource for researchers working on AI-driven mental health support, though it is incremental as it focuses on dataset creation rather than novel methods.

The authors introduced MentalChat16K, a benchmark dataset combining synthetic and anonymized real transcripts for conversational mental health assistance, covering conditions like depression and anxiety to facilitate development of large language models in this domain.

We introduce MentalChat16K, an English benchmark dataset combining a synthetic mental health counseling dataset and a dataset of anonymized transcripts from interventions between Behavioral Health Coaches and Caregivers of patients in palliative or hospice care. Covering a diverse range of conditions like depression, anxiety, and grief, this curated dataset is designed to facilitate the development and evaluation of large language models for conversational mental health assistance. By providing a high-quality resource tailored to this critical domain, MentalChat16K aims to advance research on empathetic, personalized AI solutions to improve access to mental health support services. The dataset prioritizes patient privacy, ethical considerations, and responsible data usage. MentalChat16K presents a valuable opportunity for the research community to innovate AI technologies that can positively impact mental well-being. The dataset is available at https://huggingface.co/datasets/ShenLab/MentalChat16K and the code and documentation are hosted on GitHub at https://github.com/ChiaPatricia/MentalChat16K.

View on arXiv PDF Code

Similar