CL, LG, SI | May 19, 2025

Predictively Combatting Toxicity in Health-related Online Discussions through Machine Learning

Jorge Paz-Ruza, Amparo Alonso-Betanzos, Bertha Guijarro-Berdiñas, Carlos Eiras-Franco

arXiv:2505.17068v1 | h-index: 22 | IJCNN

Originality: Synthesis-oriented

AI Analysis

This paper addresses toxicity on online health platforms, but its contribution is incremental: it applies an existing method to a new domain.

The paper tackles toxicity in health-related online discussions by predicting where users might interact toxically, using a Collaborative Filtering-based Machine Learning method. It reports over 80% predictive performance in relevant metrics, which makes it possible to avoid pairing conflicting users and subcommunities.

In health-related topics, user toxicity in online discussions frequently becomes a source of social conflict or promotion of dangerous, unscientific behaviour; common approaches for battling it include different forms of detection, flagging and/or removal of existing toxic comments, which is often counterproductive for platforms and users alike. In this work, we propose the alternative of combatting user toxicity predictively, anticipating where a user could interact toxically in health-related online discussions. Applying a Collaborative Filtering-based Machine Learning methodology, we predict the toxicity in COVID-related conversations between any user and subcommunity of Reddit, surpassing 80% predictive performance in relevant metrics, and allowing us to prevent the pairing of conflicting users and subcommunities.
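
To make the idea of predicting toxicity for a (user, subcommunity) pair more concrete, here is a minimal sketch. The abstract only says the approach is Collaborative Filtering-based, so the specific model below (a logistic matrix factorization trained with SGD), the toy data, and every name in it (`train`, `predict`, `toy_label`, the hyperparameters) are assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def sigmoid(x):
    # Numerically safe logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def train(interactions, n_users, n_subs, dim=4, lr=0.1, reg=0.01,
          epochs=200, seed=0):
    """Fit user and subcommunity embeddings on (user, sub, label) triples.

    label is 1 if the user's interaction in that subcommunity was toxic,
    0 otherwise. Hypothetical setup: hyperparameters are arbitrary.
    """
    rng = random.Random(seed)
    U = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_users)]
    S = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_subs)]
    data = list(interactions)
    for _ in range(epochs):
        rng.shuffle(data)
        for u, s, y in data:
            # Gradient of the log loss with respect to the logit.
            err = sigmoid(dot(U[u], S[s])) - y
            for k in range(dim):
                uk, sk = U[u][k], S[s][k]
                U[u][k] -= lr * (err * sk + reg * uk)
                S[s][k] -= lr * (err * uk + reg * sk)
    return U, S


def predict(U, S, user, sub):
    # Estimated probability that `user` interacts toxically in `sub`.
    return sigmoid(dot(U[user], S[sub]))


# Toy data (invented): users 0-2 are toxic in subs 0-1 only,
# users 3-5 are toxic in subs 2-3 only.
N_USERS, N_SUBS = 6, 4


def toy_label(u, s):
    return 1 if (u < 3) == (s < 2) else 0


# Pairs never seen in training, standing in for "where could this
# user interact toxically in the future?"
held_out = {(2, 1), (5, 1)}
observed = [(u, s, toy_label(u, s))
            for u in range(N_USERS) for s in range(N_SUBS)
            if (u, s) not in held_out]

U, S = train(observed, N_USERS, N_SUBS)
p_toxic = predict(U, S, 2, 1)  # unseen pair labeled toxic in the toy data
p_safe = predict(U, S, 5, 1)   # unseen pair labeled non-toxic in the toy data
print(f"user 2 in sub 1: {p_toxic:.2f}")
print(f"user 5 in sub 1: {p_safe:.2f}")
```

In this sketch, a platform could use a high predicted probability for an unseen pair as a signal not to recommend that subcommunity to that user, which is the kind of preventive pairing decision the abstract describes.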
