How toxic is antisemitism? Potentials and limitations of automated toxicity scoring for antisemitic online content
This work highlights limitations of automated toxicity scoring for antisemitic content, an issue that matters for content moderation and social media research. The contribution is incremental, as it builds on existing tools.
The study evaluates the Perspective API's ability to detect antisemitic content. It finds that the API recognizes such content as toxic but has critical weaknesses with non-explicit forms of antisemitism and with texts that take a critical stance towards it, and that simple text manipulations can bypass it.
The Perspective API, a popular text toxicity assessment service by Google and Jigsaw, has found wide adoption in several application areas, notably content moderation, monitoring, and social media research. We examine its potentials and limitations for the detection of antisemitic online content that, by definition, falls under the toxicity umbrella term. Using a manually annotated German-language dataset comprising around 3,600 posts from Telegram and Twitter, we explore how toxic antisemitic texts are rated and how the toxicity scores differ across subforms of antisemitism and the stance expressed in the texts. We show that, on a basic level, Perspective API recognizes antisemitic content as toxic, but that it has critical weaknesses with respect to non-explicit forms of antisemitism and texts taking a critical stance towards it. Furthermore, using simple text manipulations, we demonstrate that the use of widespread antisemitic codes can substantially reduce API scores, making it rather easy to bypass content moderation based on the service's results.
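The text manipulation experiment can be illustrated with a minimal sketch. This is not the authors' code: the function names (`build_request`, `extract_score`, `substitute_terms`, `score_drop`) and the placeholder terms are hypothetical. The request and response shapes are assumed to follow the public documentation of the Perspective API's `comments.analyze` method. The scorer is passed in as a function, so the sketch itself makes no network call.

```python
import re
from typing import Callable, Dict


def build_request(text: str, language: str = "de") -> dict:
    """Build a request body in the shape assumed for comments.analyze."""
    return {
        "comment": {"text": text},
        "languages": [language],
        "requestedAttributes": {"TOXICITY": {}},
    }


def extract_score(response: dict) -> float:
    """Read the summary TOXICITY score (0 to 1) from an assumed response shape."""
    return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]


def substitute_terms(text: str, mapping: Dict[str, str]) -> str:
    """Replace whole-word occurrences of each key with its value, ignoring case."""
    for explicit, coded in mapping.items():
        pattern = r"\b" + re.escape(explicit) + r"\b"
        # A function replacement avoids backslash interpretation in `coded`.
        text = re.sub(pattern, lambda _m, c=coded: c, text, flags=re.IGNORECASE)
    return text


def score_drop(
    text: str,
    mapping: Dict[str, str],
    scorer: Callable[[str], float],
) -> float:
    """Return original score minus the score after substitution.

    A positive value means the manipulated text is rated as less toxic.
    """
    original = scorer(text)
    manipulated = scorer(substitute_terms(text, mapping))
    return original - manipulated
```

In practice, `scorer` would wrap a real API client that sends `build_request(text)` and applies `extract_score` to the reply. Keeping it injectable lets the manipulation logic be tested offline with a stub.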