Evaluating Large Language Models for Anxiety and Depression Classification using Counseling and Psychotherapy Transcripts
This work addresses mental health diagnosis and is aimed at clinicians and researchers. Its contribution is incremental, since it reports no improvement over existing methods.
The study tackled the problem of classifying anxiety and depression from long conversational transcripts by evaluating traditional machine learning and large language models, finding that state-of-the-art models did not improve classification outcomes compared to traditional methods.
We aim to evaluate the efficacy of traditional machine learning and large language models (LLMs) in classifying anxiety and depression from long conversational transcripts. We fine-tune both established transformer models (BERT, RoBERTa, Longformer) and more recent large models (Mistral-7B), train a Support Vector Machine with feature engineering, and assess GPT models through prompting. We observe that state-of-the-art models fail to improve classification outcomes compared to traditional machine learning methods.
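One practical obstacle with long transcripts is that encoders such as BERT and RoBERTa accept at most 512 tokens per input. The abstract does not say how this was handled, so the following is only an illustrative sketch of one common workaround: split the transcript into overlapping windows, score each window, and average the scores. The function names `chunk_tokens` and `aggregate_mean`, the window and stride sizes, and the 0.5 threshold are all assumptions made for this example, not details taken from the study.

```python
from typing import List, Sequence


def chunk_tokens(tokens: Sequence[str], window: int = 512, stride: int = 256) -> List[List[str]]:
    """Split a token sequence into overlapping windows of at most `window` tokens.

    Hypothetical helper: window/stride values are illustrative assumptions.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    chunks: List[List[str]] = []
    start = 0
    while True:
        chunks.append(list(tokens[start:start + window]))
        # Stop once the current window reaches the end of the transcript.
        if start + window >= len(tokens):
            break
        start += stride
    return chunks


def aggregate_mean(chunk_probs: Sequence[float], threshold: float = 0.5) -> int:
    """Average per-chunk positive-class probabilities and threshold the mean.

    Hypothetical helper: mean pooling is one of several possible
    aggregation rules (max pooling or majority vote are alternatives).
    """
    if not chunk_probs:
        raise ValueError("need at least one chunk probability")
    mean_prob = sum(chunk_probs) / len(chunk_probs)
    return int(mean_prob >= threshold)
```

In use, each chunk returned by `chunk_tokens` would be passed to a fine-tuned classifier, and the resulting per-chunk probabilities would be passed to `aggregate_mean` to obtain a single transcript-level label. Models with longer context windows, such as Longformer, may need fewer chunks or none.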