CLSep 6, 2024

Towards Safer Online Spaces: Simulating and Assessing Intervention Strategies for Eating Disorder Discussions

Louis Penafiel, Hsien-Te Kao, Isabel Erickson, David Chu, Robert McCormack, Kristina Lerman, Svitlana Volkova

arXiv:2409.04043v11.91 citationsh-index: 7

Originality Incremental advance

AI Analysis

This work addresses the need for safer online spaces for individuals affected by eating disorders by providing a simulation-based approach to assess interventions, though it is incremental as it builds on existing simulation and LLM methods.

The researchers tackled the problem of safely testing intervention strategies for eating disorder discussions on social media by developing an LLM-driven testbed to simulate conversations, finding that civility-focused interventions improved positive sentiment while insight-resetting ones increased negative emotions, with significant biases observed across different LLM models.

Eating disorders are complex mental health conditions that affect millions of people around the world. Effective interventions on social media platforms are crucial, yet testing strategies in situ can be risky. We present a novel LLM-driven experimental testbed for simulating and assessing intervention strategies in ED-related discussions. Our framework generates synthetic conversations across multiple platforms, models, and ED-related topics, allowing for controlled experimentation with diverse intervention approaches. We analyze the impact of various intervention strategies on conversation dynamics across four dimensions: intervention type, generative model, social media platform, and ED-related community/topic. We employ cognitive domain analysis metrics, including sentiment, emotions, etc., to evaluate the effectiveness of interventions. Our findings reveal that civility-focused interventions consistently improve positive sentiment and emotional tone across all dimensions, while insight-resetting approaches tend to increase negative emotions. We also uncover significant biases in LLM-generated conversations, with cognitive metrics varying notably between models (Claude-3 Haiku $>$ Mistral $>$ GPT-3.5-turbo $>$ LLaMA3) and even between versions of the same model. These variations highlight the importance of model selection in simulating realistic discussions related to ED. Our work provides valuable information on the complex dynamics of ED-related discussions and the effectiveness of various intervention strategies.

View on arXiv PDF

Similar