LonXplain: Lonesomeness as a Consequence of Mental Disturbance in Reddit Posts
This work addresses the need for an explainable dataset to facilitate research in loneliness detection for mental health applications, but it is incremental as it focuses on dataset creation without novel methodological advancements.
The paper tackles the problem of detecting loneliness in social media posts as an explainable binary classification task, resulting in the creation of a publicly released dataset, LonXplain, with 3,521 annotated Reddit posts and baseline classifiers.
Social media is a potential source of information from which latent mental states can be inferred through Natural Language Processing (NLP). While narrating real-life experiences, social media users convey feelings of loneliness or an isolated lifestyle that affect their mental well-being. Existing literature on psychological theories points to loneliness as the major consequence of interpersonal risk factors, motivating the study of loneliness as a major aspect of mental disturbance. We formulate lonesomeness detection in social media posts as an explainable binary classification problem that identifies at-risk users and indicates where resilience-building is needed for early control. To the best of our knowledge, there is no existing explainable dataset, i.e., one with human-readable, annotated text spans, to facilitate further research and development in detecting loneliness that causes mental disturbance. In this work, three experts (a senior clinical psychologist, a rehabilitation counselor, and a social NLP researcher) define annotation schemes and perplexity guidelines for marking the presence or absence of lonesomeness in 3,521 Reddit posts, along with the text spans in the original posts that serve as explanations. We expect to publicly release our dataset, LonXplain, together with traditional classifiers as baselines, via GitHub.
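To make the task formulation concrete, the following is a minimal sketch of how one annotated instance (a post, a binary label, and an explanatory text span) might be represented, and how a predicted span could be compared with the annotated one. The class `AnnotatedPost`, the helpers `span_offsets` and `token_f1`, and the example post are all hypothetical illustrations; they are not taken from the LonXplain release, and token-overlap F1 is only one common way to score span explanations, not necessarily the one used for the baselines.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AnnotatedPost:
    """One hypothetical instance: post text, binary label, explanation span."""
    text: str
    label: int      # assumption: 1 = lonesomeness present, 0 = absent
    span: str = ""  # assumption: span is copied verbatim from the post


def span_offsets(post: AnnotatedPost) -> Optional[Tuple[int, int]]:
    """Return (start, end) character offsets of the span, or None if there is none."""
    if post.label == 0 or not post.span:
        return None
    start = post.text.find(post.span)
    if start < 0:
        raise ValueError("explanation span does not occur in the post text")
    return start, start + len(post.span)


def token_f1(predicted: str, gold: str) -> float:
    """Token-overlap F1 between a predicted span and the annotated span."""
    pred_tokens = predicted.lower().split()
    gold_tokens = gold.lower().split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


# Made-up example post for illustration only (not from the dataset).
EXAMPLE = AnnotatedPost(
    text="I moved last year and I have no one to talk to anymore.",
    label=1,
    span="no one to talk to",
)
```

Under these assumptions, `span_offsets(EXAMPLE)` locates the marked span inside the post, and `token_f1` gives partial credit when a predicted explanation covers only part of the annotated span.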