ASCL | Dec 26, 2024

Robust Speech and Natural Language Processing Models for Depression Screening

arXiv:2412.19072v1 | 5 citations | h-index: 63 | SPMB
Originality: Synthesis-oriented
AI Analysis

This work addresses remote depression screening, though it appears incremental: it applies existing transfer learning methods to a new domain-specific dataset.

The authors developed two deep learning models for depression screening, one acoustic and one NLP-based, using transfer learning on a dataset of 11,000 users. Both models achieve AUC scores of 0.80 or higher on unseen data with no speaker overlap.

Depression is a global health concern with a critical need for increased patient screening. Speech technology offers advantages for remote screening but must perform robustly across patients. We describe two deep learning models developed for this purpose. One model is based on acoustics; the other is based on natural language processing. Both models employ transfer learning. We use data from a depression-labeled corpus in which 11,000 unique users interacted with a human-machine application using conversational speech. Results on binary depression classification show that both models perform at or above AUC=0.80 on unseen data with no speaker overlap. Performance is further analyzed as a function of test subset characteristics, finding that the models are generally robust over speaker and session variables. We conclude that models based on these approaches offer promise for generalized automated depression screening.
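
The evaluation protocol named in the abstract (binary classification scored by AUC on test data with no speaker overlap) can be illustrated with a minimal sketch. This is not the authors' code: the function names `speaker_disjoint_split` and `roc_auc`, the 20% test fraction, and the toy labels and scores are all assumptions made for illustration, and the paper does not state how its split was produced.

```python
import random
from collections import defaultdict


def speaker_disjoint_split(speaker_ids, test_fraction=0.2, seed=0):
    """Split session indices so that no speaker lands in both train and test.

    Hypothetical helper: the split is made over speakers, not sessions,
    which is what "no speaker overlap" requires.
    """
    by_speaker = defaultdict(list)
    for idx, spk in enumerate(speaker_ids):
        by_speaker[spk].append(idx)

    speakers = sorted(by_speaker)
    random.Random(seed).shuffle(speakers)

    # Assumed test fraction; at least one speaker is always held out.
    n_test = max(1, round(len(speakers) * test_fraction))
    test = [i for s in speakers[:n_test] for i in by_speaker[s]]
    train = [i for s in speakers[n_test:] for i in by_speaker[s]]
    return train, test


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: one entry per session, several sessions per speaker.
speaker_ids = ["a", "a", "b", "c", "c", "d", "e"]
train_idx, test_idx = speaker_disjoint_split(speaker_ids)

# Toy labels (1 = depressed) and model scores, invented for illustration.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Splitting by speaker rather than by session matters because a model can otherwise score well by recognizing voices or phrasing it has already seen, which would overstate performance on new patients.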
