LG · Dec 26, 2024

Context-Aware Deep Learning for Multi Modal Depression Detection

arXiv:2412.19209v1 · 107 citations · h-index: 89 · Has Code · ICASSP
Originality: Incremental advance
AI Analysis

This work addresses depression detection for clinical applications. It is an incremental advance: it builds on existing multi-modal methods and adds specific improvements.

The study tackled automated depression detection from clinical interviews using multi-modal machine learning, achieving state-of-the-art performance for audio and text modalities individually and in combination.

In this study, we focus on automated approaches to detect depression from clinical interviews using multi-modal machine learning (ML). Our approach differs from other successful ML methods, such as context-aware analysis through feature engineering and end-to-end deep neural networks, for depression detection using the Distress Analysis Interview Corpus. We propose a novel method that incorporates: (1) a pre-trained Transformer combined with data augmentation based on topic modelling for textual data; and (2) a deep 1D convolutional neural network (CNN) for acoustic feature modelling. The simulation results demonstrate the effectiveness of the proposed method for training multi-modal deep learning models. Our deep 1D CNN and Transformer models achieved state-of-the-art performance for the audio and text modalities, respectively. Combining them in a multi-modal framework also outperforms the state of the art for the combined setting. Code is available at https://github.com/genandlam/multi-modal-depression-detection
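The abstract does not spell out how the topic-based data augmentation works, so the following is only a minimal sketch of one plausible reading: a participant's responses are grouped by topic, and new training transcripts are formed by sampling a subset of topics and concatenating them in a random order. The function name `augment_by_topic`, its parameters, and the toy transcript are all hypothetical and are not taken from the paper or its repository.

```python
import random
from typing import Dict, List


def augment_by_topic(
    responses_by_topic: Dict[str, List[str]],
    n_samples: int,
    topics_per_sample: int,
    seed: int = 0,
) -> List[str]:
    """Build synthetic transcripts by sampling and reordering topic blocks.

    Hypothetical helper (an assumption, not the authors' method): each new
    sample keeps a random subset of topics in random order. The caller is
    expected to reuse the participant's original label for every sample.
    """
    rng = random.Random(seed)  # local RNG so results are reproducible
    topics = sorted(responses_by_topic)
    if topics_per_sample > len(topics):
        raise ValueError("topics_per_sample exceeds the number of topics")

    samples = []
    for _ in range(n_samples):
        # random.sample returns the chosen topics in random order
        chosen = rng.sample(topics, topics_per_sample)
        text = " ".join(
            utterance
            for topic in chosen
            for utterance in responses_by_topic[topic]
        )
        samples.append(text)
    return samples


# Toy transcript for one participant, already grouped by topic (made-up data).
participant = {
    "sleep": ["i have trouble sleeping", "i wake up at night"],
    "family": ["i am close to my sister"],
    "work": ["i lost my job last year"],
    "mood": ["i have been feeling down"],
}
augmented = augment_by_topic(participant, n_samples=5, topics_per_sample=3)
```

In a pipeline of this kind, each string in `augmented` would be tokenized and passed to the text model alongside the original transcript. Whether the topics come from a fixed interview script or from a learned topic model is left open here.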

Code Implementations: 1 repo
