CL IR Jul 10, 2024

DS@GT eRisk 2024: Sentence Transformers for Social Media Risk Assessment

David Guecha, Aaryan Potdar, Anthony Miyaguchi

arXiv:2407.08008v11.03 citations h-index: 4 Has Code

Originality Synthesis-oriented

AI Analysis

This work addresses mental health risk assessment from social media data, but it is incremental as it builds on existing methods without major breakthroughs.

The authors tackled depression symptom ranking and eating disorder severity prediction from social media posts, finding that binary classifiers performed poorly for ranking but classical models with BERT embeddings were competitive with baselines.

We present working notes for the DS@GT team in eRisk 2024 for Tasks 1 and 3. We propose a ranking system for Task 1 that predicts symptoms of depression based on the Beck Depression Inventory (BDI-II) questionnaire, using binary classifiers trained on question relevancy as a proxy for ranking. We find that binary classifiers are not well calibrated for ranking and perform poorly during evaluation. For Task 3, we use embeddings from BERT to predict the severity of eating disorder symptoms based on user post history. We find that classical machine learning models perform well on the task and end up competitive with the baseline models. Representation of text data is crucial in both tasks, and we find that sentence transformers are a powerful tool for downstream modeling. Source code and models are available at https://github.com/dsgt-kaggle-clef/erisk-2024.
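
The Task 1 idea of using a binary relevance classifier as a proxy for ranking can be illustrated with a minimal sketch. This is not the authors' implementation: the 2-D vectors below are toy stand-ins for sentence-transformer embeddings, logistic regression is an assumed classifier choice, and the names train_logistic, predict_proba, and rank_sentences are hypothetical. One classifier is fit for a single questionnaire item, and candidate sentences are then ordered by predicted relevance probability.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(vec, w, b):
    """Probability that an embedded sentence is relevant to the question."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, vec)) + b)


def train_logistic(X, y, lr=0.5, epochs=200):
    """Fit binary logistic regression with plain batch gradient descent."""
    dim = len(X[0])
    w, b, n = [0.0] * dim, 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(xi, w, b) - yi
            for j in range(dim):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def rank_sentences(candidates, w, b, top_k=3):
    """Order candidate sentences by classifier score, highest first."""
    scored = [(sid, predict_proba(vec, w, b)) for sid, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy stand-ins for embeddings (assumption): relevant sentences cluster
# near (1, 1), irrelevant ones near (-1, -1).
relevant = [[1.0, 0.8], [0.9, 1.1], [1.2, 1.0], [0.8, 0.9]]
irrelevant = [[-1.0, -0.9], [-0.8, -1.1], [-1.1, -1.0], [-0.9, -0.8]]
X = relevant + irrelevant
y = [1] * len(relevant) + [0] * len(irrelevant)

w, b = train_logistic(X, y)

# Hypothetical candidate sentences, keyed by an illustrative id.
candidates = {
    "s_irrel": [-1.0, -1.1],
    "s_mid": [0.1, 0.0],
    "s_rel": [1.2, 0.9],
}
for sid, score in rank_sentences(candidates, w, b):
    print(sid, round(score, 3))
```

The ordering depends only on the relative scores, but ranking metrics are sensitive to how well those scores separate borderline sentences. A classifier trained for a yes/no decision gives no guarantee on that, which is consistent with the calibration problem the abstract reports.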
