AS | LG | Apr 9, 2021

Speech based Depression Severity Level Classification Using a Multi-Stage Dilated CNN-LSTM Model

arXiv:2104.04195v1 | 18 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for more detailed depression assessment in mental health diagnostics, though it is incremental as it builds on existing speech-based classification methods.

The paper classifies depression severity levels from speech instead of making a binary depressed/non-depressed decision, which gives more granular outcomes. Using articulatory coordination features (ACFs) derived from vocal tract variables, it reports a 27.47% relative improvement in Unweighted Average Recall (UAR) on session-level classification over ACFs derived from Mel Frequency Cepstral Coefficients (MFCCs).
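For readers unfamiliar with the metric, the sketch below shows how Unweighted Average Recall and a relative improvement are usually computed. The labels and scores are hypothetical toy values, not results from the paper, and the function names are illustrative.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls, so every class counts equally
    no matter how many samples it has."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


def relative_improvement(baseline, improved):
    """Percentage gain of `improved` over `baseline`."""
    return (improved - baseline) / baseline * 100.0


# Hypothetical, imbalanced toy data with three severity classes (0, 1, 2).
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 0, 2]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
gain = relative_improvement(0.5, 0.6)  # hypothetical baseline and improved UAR
```

On imbalanced data like this, plain accuracy is dominated by the majority class while UAR penalizes the poorly recognized minority classes, which is one common reason to report UAR for severity-level tasks.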

Speech-based depression classification has gained immense popularity in recent years. However, most classification studies have focused on binary classification to distinguish depressed subjects from non-depressed subjects. In this paper, we formulate the depression classification task as a severity level classification problem to provide more granularity to the classification outcomes. We use articulatory coordination features (ACFs) developed to capture the changes in neuromotor coordination that happen as a result of psychomotor slowing, a necessary feature of Major Depressive Disorder. The ACFs derived from the vocal tract variables (TVs) are used to train a dilated Convolutional Neural Network based depression classification model to obtain segment-level predictions. Then, we propose a Recurrent Neural Network based approach to obtain session-level predictions from segment-level predictions. We show that the strengths of the segment-wise classifier are amplified when a session-wise classifier is trained on embeddings obtained from it. The model trained on ACFs derived from TVs shows a relative improvement of 27.47% in Unweighted Average Recall (UAR) on the session-level classification task, compared to the model trained on ACFs derived from Mel Frequency Cepstral Coefficients (MFCCs).
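To illustrate what "dilated" means in the segment-level model, here is a minimal pure-Python sketch of a 1-D dilated convolution and of the receptive field of a stack of such layers. This summary does not give the paper's kernel sizes, dilation rates, or layer count, so every number below is a hypothetical placeholder and the function names are illustrative.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1-D cross-correlation whose kernel taps are spaced
    `dilation` samples apart (dilation=1 is an ordinary convolution)."""
    span = (len(kernel) - 1) * dilation + 1  # input samples covered by one output
    return [
        sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


def receptive_field(kernel_size, dilations):
    """Input samples seen by one output of stacked dilated layers (stride 1)."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Hypothetical toy input and a simple difference kernel.
x = [1, 2, 3, 4, 5, 6, 7, 8]
k = [1, 0, -1]

dense = dilated_conv1d(x, k, dilation=1)  # each output spans 3 samples
wide = dilated_conv1d(x, k, dilation=3)   # same 3 weights, spans 7 samples

# Four stacked layers with kernel size 3 and doubling dilation (assumed values).
rf = receptive_field(3, [1, 2, 4, 8])
```

The same three weights cover a wider stretch of input as the dilation grows, which lets a model look across longer spans of a feature sequence without adding parameters.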

