Self-Supervised Embeddings for Detecting Individual Symptoms of Depression
This work addresses the need for reliable depression assessment systems, focusing on symptom-level detection rather than only overall diagnosis or severity prediction, which is valuable for mental health monitoring.
The paper tackles the problem of detecting individual symptoms of depression and predicting its severity using speech input, achieving notable performance improvements with self-supervised learning embeddings compared to conventional features.
Depression, a prevalent mental health disorder impacting millions globally, demands reliable assessment systems. Unlike previous studies that focus solely on either detecting depression or predicting its severity, our work identifies individual symptoms of depression while also predicting its severity using speech input. We leverage self-supervised learning (SSL)-based speech models to make better use of the small datasets that are frequently encountered in this task. Our study demonstrates notable performance improvements from SSL embeddings compared to conventional speech features. We compare various types of SSL pretrained models to elucidate which type of speech information (semantic, speaker, or prosodic) contributes most to identifying different symptoms. Additionally, we evaluate the impact of combining multiple SSL embeddings on performance. Furthermore, we show the significance of multi-task learning for identifying depressive symptoms effectively.
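To make the combination of SSL embeddings and the multi-task setup concrete, the following is a minimal sketch and not the architecture used in this work. It assumes that utterance-level embeddings from three SSL models have already been extracted, that they are combined by simple concatenation, that each symptom is treated as a binary label, and that severity is a single regression target. All dimensions, the number of symptoms, the function names, and the loss weight are hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration only; none are taken from the paper.
DIM_SEMANTIC, DIM_SPEAKER, DIM_PROSODIC = 16, 8, 8
N_SYMPTOMS = 8
HIDDEN = 32


def combine_embeddings(*embeddings):
    """Concatenate utterance-level embeddings from several SSL models.

    Concatenation is an assumed fusion strategy, not necessarily the paper's.
    """
    return np.concatenate(embeddings, axis=-1)


def init_params(in_dim):
    """Randomly initialise one shared layer and two task-specific heads."""
    return {
        "W_shared": rng.normal(0.0, 0.1, (in_dim, HIDDEN)),
        "b_shared": np.zeros(HIDDEN),
        "W_sym": rng.normal(0.0, 0.1, (HIDDEN, N_SYMPTOMS)),
        "b_sym": np.zeros(N_SYMPTOMS),
        "W_sev": rng.normal(0.0, 0.1, (HIDDEN, 1)),
        "b_sev": np.zeros(1),
    }


def forward(params, x):
    """Shared ReLU layer, then one logit per symptom and one severity score."""
    h = np.maximum(0.0, x @ params["W_shared"] + params["b_shared"])
    symptom_logits = h @ params["W_sym"] + params["b_sym"]
    severity = (h @ params["W_sev"] + params["b_sev"]).squeeze(-1)
    return symptom_logits, severity


def multitask_loss(symptom_logits, severity, symptom_labels, severity_target,
                   severity_weight=0.5):
    """Binary cross-entropy over symptoms plus weighted MSE on severity.

    The weight of 0.5 is an arbitrary illustrative choice.
    """
    # Numerically stable binary cross-entropy computed from logits.
    bce = np.mean(
        np.maximum(symptom_logits, 0.0)
        - symptom_logits * symptom_labels
        + np.log1p(np.exp(-np.abs(symptom_logits)))
    )
    mse = np.mean((severity - severity_target) ** 2)
    return bce + severity_weight * mse


# Usage with random stand-ins for real SSL embeddings of 4 utterances.
semantic = rng.normal(size=(4, DIM_SEMANTIC))
speaker = rng.normal(size=(4, DIM_SPEAKER))
prosodic = rng.normal(size=(4, DIM_PROSODIC))
features = combine_embeddings(semantic, speaker, prosodic)
params = init_params(features.shape[-1])
logits, severity = forward(params, features)
```

In this sketch the shared layer is what lets the symptom and severity tasks inform each other. Dropping one of the arguments to `combine_embeddings` gives a simple way to probe which type of speech information a given symptom relies on.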