Fairness-Aware Multi-Group Target Detection in Online Discussion
This work addresses fairness in content moderation for social media platforms, but its contribution is incremental, as it builds on existing target-group detection methods.
The paper tackles the problem of detecting which demographic groups online content targets, focusing on fairness in toxicity detection, and shows that its fairness-aware multi-group approach reduces bias across groups and outperforms existing fairness-aware baselines.
Target-group detection is the task of detecting which group(s) a piece of content is ``directed at or about''. Applications include targeted marketing, content recommendation, and group-specific content assessment. Key challenges include: 1) handling posts that target multiple groups; and 2) ensuring consistent detection accuracy across groups for fairness. In this work, we investigate the fairness implications of target-group detection in the context of toxicity detection, where the perceived harm of a social media post often depends on which group(s) it targets. Because toxicity is highly contextual, language that appears benign in general can be harmful when targeting specific demographic groups. We show that our {\em fairness-aware multi-group target detection} approach both reduces bias across groups and achieves strong predictive performance, surpassing existing fairness-aware baselines. To enable reproducibility and spur future work, we share our code online.
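The two challenges named above can be made concrete with a small sketch. This is an illustration under stated assumptions, not the paper's method: each post is represented by the set of groups it targets (so a post may carry several labels or none), and fairness is approximated by the spread in per-group detection accuracy. The group names, the toy data, and the helper functions `per_group_accuracy` and `accuracy_gap` are all hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical group labels, used only for illustration.
GROUPS = ["group_a", "group_b", "group_c"]


def per_group_accuracy(
    y_true: List[Set[str]], y_pred: List[Set[str]], groups: List[str]
) -> Dict[str, float]:
    """Binary accuracy of each group's label, computed over all posts.

    Each post is a set of targeted groups, so one post can count as a
    positive example for several groups at once (multi-group targeting).
    """
    n = len(y_true)
    acc = {}
    for g in groups:
        # A prediction is correct for group g when membership matches.
        correct = sum((g in t) == (g in p) for t, p in zip(y_true, y_pred))
        acc[g] = correct / n
    return acc


def accuracy_gap(acc: Dict[str, float]) -> float:
    """Difference between the best- and worst-served group (0 = equal)."""
    return max(acc.values()) - min(acc.values())


# Toy data (assumed): four posts with gold and predicted target sets.
y_true = [{"group_a"}, {"group_a", "group_b"}, set(), {"group_c"}]
y_pred = [{"group_a"}, {"group_a"}, set(), {"group_b", "group_c"}]

acc = per_group_accuracy(y_true, y_pred, GROUPS)
gap = accuracy_gap(acc)
print(acc, gap)
```

In this toy example the detector is perfect for two groups but errs on half the posts for the third, so the gap is large even though the average accuracy looks acceptable. A fairness-aware approach would aim to shrink this kind of gap without sacrificing overall performance; the specific fairness measure used in the paper may differ from the simple max-min gap assumed here.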