Semantic Similarity Models for Depression Severity Estimation
This work addresses the challenge of rapid screening for depressive disorders using social media data, offering a computational support tool for public health systems, though it is incremental in nature.
The paper tackles the problem of estimating depression severity from social media text by developing a semantic similarity pipeline that ranks user sentences against training sentences representing depressive symptoms and severity levels, achieving a 30% improvement over state-of-the-art methods on Reddit-based benchmarks.
Depressive disorders constitute a severe public health issue worldwide. However, public health systems have limited capacity for case detection and diagnosis. In this regard, the widespread use of social media has opened up a way to access public information on a large scale. Computational methods can serve as support tools for rapid screening by exploiting this user-generated social media content. This paper presents an efficient semantic pipeline to study depression severity in individuals based on their social media writings. We use selected sentences from each test user as queries to produce semantic rankings over an index of representative training sentences corresponding to depressive symptoms and severity levels. We then use the sentences retrieved in those rankings as evidence for predicting the user's symptom severity. To that end, we explore different aggregation methods for selecting one of the four Beck Depression Inventory (BDI) options per symptom. We evaluate our methods on two Reddit-based benchmarks, achieving a 30% improvement over the state of the art in measuring depression severity.
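The ranking-and-aggregation idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the bag-of-words `embed` function is a toy stand-in for a real sentence encoder, the four indexed sentences and the user sentences are invented, and summing similarity scores per option is only one possible aggregation method, not necessarily one of those explored in the paper.

```python
import math
from collections import Counter, defaultdict


def embed(text):
    """Toy stand-in for a sentence encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank(query, index, k=3):
    """Return the k (score, option) pairs most similar to the query."""
    q = embed(query)
    scored = [(cosine(q, embed(sentence)), option) for sentence, option in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def predict_option(user_sentences, index, k=3):
    """Aggregate retrieved evidence into one option (0-3) for a symptom.

    Hypothetical aggregation: sum the similarity scores of the retrieved
    sentences per option and pick the option with the largest total.
    """
    votes = defaultdict(float)
    for sentence in user_sentences:
        for score, option in rank(sentence, index, k):
            votes[option] += score
    if not votes:
        return 0  # no evidence: fall back to the lowest severity option
    return max(sorted(votes), key=votes.get)  # ties go to the lower option


# Invented index for a single symptom: (training sentence, severity option).
SADNESS_INDEX = [
    ("i feel fine and cheerful today", 0),
    ("i feel sad some of the time", 1),
    ("i am sad all the time", 2),
    ("i am so unhappy i cannot stand it", 3),
]

# Invented sentences standing in for one user's posts.
USER_SENTENCES = [
    "lately i am sad all the time",
    "i cannot remember a day i was not sad all the time",
]

print(predict_option(USER_SENTENCES, SADNESS_INDEX))  # → 2
```

In a realistic setting, `embed` would be replaced by a dense sentence encoder and the index would hold many training sentences per symptom and option; the aggregation step is the part where design choices such as voting or score weighting come into play.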