CL | May 27, 2025

Leveraging large language models and traditional machine learning ensembles for ADHD detection from narrative transcripts

Yuxin Zhu, Yuting Guo, Noah Marchuck, Abeed Sarker, Yun Wang

arXiv:2505.21324v1 | 6.72 citations | h-index: 9 | Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses psychiatric text classification for medical applications, but it is incremental: it combines existing methods without introducing a fundamentally new approach.

The study tackled ADHD detection from narrative transcripts by integrating an ensemble of LLaMA3, RoBERTa, and SVM models, achieving an F1 score of 0.71 and improving recall compared to individual models.

Despite rapid advances in large language models (LLMs), their integration with traditional supervised machine learning (ML) techniques that have proven applicability to medical data remains underexplored. This is particularly true for psychiatric applications, where narrative data often exhibit nuanced linguistic and contextual complexity, and can benefit from the combination of multiple models with differing characteristics. In this study, we introduce an ensemble framework for automatically classifying Attention-Deficit/Hyperactivity Disorder (ADHD) diagnosis (binary) using narrative transcripts. Our approach integrates three complementary models: LLaMA3, an open-source LLM that captures long-range semantic structure; RoBERTa, a pre-trained transformer model fine-tuned on labeled clinical narratives; and a Support Vector Machine (SVM) classifier trained using TF-IDF-based lexical features. These models are aggregated through a majority voting mechanism to enhance predictive robustness. The dataset comprises 441 instances: 352 for training and 89 for validation. Empirical results show that the ensemble outperforms individual models, achieving an F1 score of 0.71 (95% CI: [0.60-0.80]). Compared to the best-performing individual model (SVM), the ensemble improved recall while maintaining competitive precision. This indicates the strong sensitivity of the ensemble in identifying ADHD-related linguistic cues. These findings demonstrate the promise of hybrid architectures that leverage the semantic richness of LLMs alongside the interpretability and pattern recognition capabilities of traditional supervised ML, offering a new direction for robust and generalizable psychiatric text classification.
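
The majority voting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the label coding (1 = ADHD, 0 = non-ADHD), and the toy predictions are all assumptions made for illustration. It shows how three binary prediction lists are combined and how precision, recall, and F1 are computed for the result.

```python
from typing import Sequence


def majority_vote(*model_preds: Sequence[int]) -> list[int]:
    """Combine binary predictions from an odd number of models by majority vote.

    Assumed label coding (not stated in the source): 1 = ADHD, 0 = non-ADHD.
    """
    if len(model_preds) % 2 == 0:
        raise ValueError("use an odd number of models to avoid ties")
    n = len(model_preds[0])
    if any(len(p) != n for p in model_preds):
        raise ValueError("all models must predict the same instances")
    threshold = len(model_preds) / 2
    # An instance is labeled positive when more than half the models say so.
    return [int(sum(votes) > threshold) for votes in zip(*model_preds)]


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int]):
    """Return (precision, recall, F1) for the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical toy predictions, not data from the paper.
y_true = [1, 1, 1, 0, 0, 0]
llama_preds = [1, 1, 0, 1, 0, 0]
roberta_preds = [1, 1, 1, 0, 0, 1]
svm_preds = [1, 0, 0, 0, 0, 0]

ensemble_preds = majority_vote(llama_preds, roberta_preds, svm_preds)
ensemble_scores = precision_recall_f1(y_true, ensemble_preds)
svm_scores = precision_recall_f1(y_true, svm_preds)
```

In this toy case the ensemble recovers a positive instance that the SVM alone misses, so its recall is higher, which mirrors the kind of effect the abstract reports without reproducing its numbers.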
