CL | Dec 26, 2024

Cross-Demographic Portability of Deep NLP-Based Depression Models

arXiv:2412.19070v1 | 8 citations | h-index: 63 | SLT
Originality: Incremental advance
AI Analysis

This paper addresses a gap in demographic portability for behavioral health applications, reporting incremental but promising results for cross-age generalization.

The study examines how well deep NLP-based depression models generalize across age groups. A model trained on younger speakers achieved AUC=0.82 on unseen younger speakers and degraded only modestly to AUC=0.76 on a corpus of seniors, rising to AUC=0.81 for the subset of seniors whose health state was consistent over time.

Deep learning models are rapidly gaining interest for real-world applications in behavioral health. An important gap in current literature is how well such models generalize over different populations. We study Natural Language Processing (NLP) based models to explore portability over two different corpora highly mismatched in age. The first and larger corpus contains younger speakers. It is used to train an NLP model to predict depression. When testing on unseen speakers from the same age distribution, this model performs at AUC=0.82. We then test this model on the second corpus, which comprises seniors from a retirement community. Despite the large demographic differences in the two corpora, we saw only modest degradation in performance for the senior-corpus data, achieving AUC=0.76. Interestingly, in the senior population, we find AUC=0.81 for the subset of patients whose health state is consistent over time. Implications for demographic portability of speech-based applications are discussed.
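
The comparison described above (one trained model, scored on a matched test set and on a demographically shifted one) can be illustrated with a minimal sketch. The function `roc_auc` and all labels and scores below are hypothetical toy values chosen for illustration; they are not the paper's data, model, or evaluation code. AUC is computed here as the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half.

```python
def roc_auc(labels, scores):
    """Pairwise AUC: probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy predictions (1 = depressed, 0 = not depressed).
# "matched" stands in for unseen speakers from the training age distribution,
# "shifted" for speakers from a different age group.
matched_labels = [1, 1, 1, 0, 0, 0]
matched_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
shifted_labels = [1, 1, 1, 0, 0, 0]
shifted_scores = [0.9, 0.5, 0.3, 0.6, 0.4, 0.2]

auc_matched = roc_auc(matched_labels, matched_scores)
auc_shifted = roc_auc(shifted_labels, shifted_scores)
print(f"matched AUC = {auc_matched:.3f}, shifted AUC = {auc_shifted:.3f}")
```

Because AUC depends only on how examples are ranked, not on a decision threshold, the matched and shifted scores can be compared directly even if the two populations have different score distributions.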

