Enriching Abusive Language Detection with Community Context
This work addresses the inadvertent censorship of productive conversations held by marginalized communities. It is an incremental improvement that adds community-level features to existing abuse detection methods.
The paper tackles the problem of abusive language detection models misclassifying pejorative expressions that marginalized groups use in benign or empowering ways, and finds that incorporating community context improves accuracy and reduces the false positive rates of state-of-the-art classifiers.
Uses of pejorative expressions can be benign or actively empowering. When models for abuse detection misclassify these expressions as derogatory, they inadvertently censor productive conversations held by marginalized groups. One way to engage with non-dominant perspectives is to add context around conversations. Previous research has leveraged user- and thread-level features, but it often neglects the spaces within which productive conversations take place. Our paper highlights how community context can improve classification outcomes in abusive language detection. We make two main contributions to this end. First, we demonstrate that online communities cluster by the nature of their support towards victims of abuse. Second, we establish how community context improves accuracy and reduces the false positive rates of state-of-the-art abusive language classifiers. These findings suggest a promising direction for context-aware models in abusive language research.
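The two evaluation quantities named above, accuracy and false positive rate, can be made concrete with a minimal sketch. The function names and the toy labels are hypothetical illustrations and are not data or results from the paper. The label convention (1 = abusive, 0 = benign) is also an assumption.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of posts whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def false_positive_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """FP / (FP + TN): share of benign posts (label 0) flagged as abusive (label 1)."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must be equal in length")
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    if fp + tn == 0:
        return 0.0  # no benign posts in the sample, so the rate is undefined; report 0
    return fp / (fp + tn)


# Hypothetical toy labels, NOT results from the paper.
# Assumed convention: 1 = abusive, 0 = benign.
y_true = [0, 0, 0, 0, 1, 1]
text_only = [1, 1, 0, 0, 1, 1]      # imagined classifier without community context
with_context = [0, 1, 0, 0, 1, 1]   # imagined classifier with community context

print(accuracy(y_true, text_only), false_positive_rate(y_true, text_only))
print(accuracy(y_true, with_context), false_positive_rate(y_true, with_context))
```

In this toy setting, a lower false positive rate means that fewer benign uses of pejorative expressions are flagged as abusive, which is the kind of error the abstract associates with censoring productive conversations.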