CL · Jul 11, 2022

CAMS: An Annotated Corpus for Causal Analysis of Mental Health Issues in Social Media Posts

Muskan Garg, Chandni Saxena, Veena Krishnan, Ruchi Joshi, Sriparna Saha, Vijay Mago, Bonnie J Dorr

arXiv:2207.04674v131.5600 citations · h-index: 45 · Has Code

Originality: Synthesis-oriented

AI Analysis

This paper provides a new annotated corpus for researchers analyzing the causes of mental health issues in social media posts, but the contribution is incremental because it builds on existing datasets and methods.

The authors address causal analysis of mental health issues in social media by introducing the CAMS dataset, which combines 3155 newly annotated Reddit posts with 1896 re-annotated instances from the SDCNL dataset. In their experiments, a Logistic Regression model outperforms the next-best CNN-LSTM model by 4.9% in accuracy.

The research community has witnessed substantial growth in the detection of mental health issues and their associated reasons from the analysis of social media. We introduce a new dataset for Causal Analysis of Mental health issues in Social media posts (CAMS). Our contributions to causal analysis are two-fold: causal interpretation and causal categorization. We introduce an annotation schema for this task of causal analysis. We demonstrate the efficacy of our schema on two different datasets: (i) 3155 Reddit posts that we crawl and annotate and (ii) the publicly available SDCNL dataset of 1896 instances, which we re-annotate for interpretable causal analysis. We further combine these into the CAMS dataset and make this resource publicly available along with the associated source code: https://github.com/drmuskangarg/CAMS. We present experimental results of models learned from the CAMS dataset and demonstrate that a classic Logistic Regression model outperforms the next-best (CNN-LSTM) model by 4.9% in accuracy.
