Automated speech-based screening of depression using deep convolutional neural networks
This work addresses the need for objective and accessible depression screening tools, though its contribution is incremental because it builds on existing deep learning methods.
The paper tackles automated depression detection from speech by applying convolutional neural networks to spectrograms, achieving a baseline accuracy of 77% on a dataset of 2568 voice samples from 107 individuals.
Early detection and treatment of depression are essential in promoting remission, preventing relapse, and reducing the emotional burden of the disease. Current diagnoses are primarily subjective, inconsistent across professionals, and expensive for individuals who may be in urgent need of help. This paper proposes a novel approach to automated depression detection in speech using a convolutional neural network (CNN) and multipart interactive training. The model was tested using 2568 voice samples obtained from 77 non-depressed and 30 depressed individuals. In the experiments, the data were fed to residual CNNs as spectrograms, that is, images generated automatically from the audio samples. The results obtained with different ResNet architectures gave a promising baseline accuracy of up to 77%.
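
To illustrate what turning an audio sample into a spectrogram involves, the following is a minimal sketch using only the Python standard library. It is not the authors' pipeline: the frame length, hop size, Hann window, decibel scaling, and the synthetic 1 kHz tone standing in for a voice recording are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n (an assumed choice of window)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def log_spectrogram(samples, frame_len=64, hop=32, eps=1e-10):
    """Short-time Fourier transform magnitudes in dB.

    Returns a list of time frames; each frame is a list of values for
    frequency bins 0 .. frame_len // 2. Rendered as an image, rows and
    columns of this grid are what a CNN would receive as input.
    """
    window = hann(frame_len)
    n_bins = frame_len // 2 + 1
    # Precompute the DFT basis once, since it is reused for every frame.
    basis = [
        [cmath.exp(-2j * math.pi * k * n / frame_len) for n in range(frame_len)]
        for k in range(n_bins)
    ]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + i] * window[i] for i in range(frame_len)]
        row = []
        for k in range(n_bins):
            coeff = sum(x * b for x, b in zip(frame, basis[k]))
            # eps avoids log10(0) for bins with no energy.
            row.append(20 * math.log10(abs(coeff) + eps))
        frames.append(row)
    return frames


# Synthetic stand-in for a voice sample: a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
spec = log_spectrogram(tone)
peak_bins = [max(range(len(row)), key=row.__getitem__) for row in spec]
```

With these assumed settings, each frequency bin spans 8000 / 64 = 125 Hz, so the energy of the 1 kHz tone concentrates in bin 8 of every frame. In practice a library routine with mel scaling would typically be used, and the resulting grid would be saved or resized as an image before being passed to a ResNet.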