CL · AI · Dec 6, 2024

A Federated Approach to Few-Shot Hate Speech Detection for Marginalized Communities

arXiv:2412.04942v2 · 3 citations · h-index: 70
Originality: Incremental advance
AI Analysis

The paper addresses online hate speech targeting marginalized communities in the Global South and offers a privacy-preserving tool, but the advance is incremental because it applies existing federated learning to a new domain.

The paper tackles hate speech detection for marginalized communities in low-resource languages by releasing a new dataset and proposing a federated learning approach, which performs effectively across different target groups while preserving privacy.

Hate speech online remains an understudied issue for marginalized communities, particularly in the Global South, which includes developing societies with increasing internet penetration. In this paper, we aim to provide marginalized communities in societies where the dominant language is low-resource with a privacy-preserving tool to protect themselves from online hate speech by filtering offensive content in their native languages. Our contributions are twofold: 1) we release REACT (REsponsive hate speech datasets Across ConTexts), a collection of high-quality, culture-specific hate speech detection datasets comprising multiple target groups and low-resource languages, curated by experienced data collectors; 2) we propose a few-shot hate speech detection approach based on federated learning (FL), a privacy-preserving method for collaboratively training a central model that exhibits robustness when tackling different target groups and languages. By keeping training local to user devices, we ensure data privacy while leveraging the collective learning benefits of FL. Furthermore, we explore personalized client models tailored to specific target groups and evaluate their performance. Our findings indicate the overall effectiveness of FL across different target groups, and point to personalization as a promising direction.
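The abstract describes federated learning only at a high level: training stays on user devices and a central model is built collaboratively. The sketch below illustrates that general idea with plain federated averaging (FedAvg) on a toy logistic-regression classifier. It is not the authors' implementation. The abstract does not specify the aggregation rule, model, or hyperparameters, so the function names (`local_update`, `fed_avg`, `run_rounds`), the two-feature toy data, and the learning rate are all assumptions made for illustration.

```python
import math

def local_update(weights, data, lr=0.1, epochs=5):
    """Client step (hypothetical): SGD on logistic loss over local examples.
    The raw examples in `data` are only read here; only weights are returned."""
    w = list(weights)
    for _ in range(epochs):
        for x, y in data:
            z = sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            for i, xi in enumerate(x):
                w[i] -= lr * (p - y) * xi
    return w

def fed_avg(client_weights, client_sizes):
    """Server step: average client weights, weighted by local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

def run_rounds(clients, dim, rounds=20):
    """Repeat: broadcast global weights, train locally, aggregate on the server."""
    global_w = [0.0] * dim
    for _ in range(rounds):
        updates = [local_update(global_w, data) for data in clients]
        global_w = fed_avg(updates, [len(data) for data in clients])
    return global_w

def predict(w, x):
    """Return 1 (flagged) if the linear score is positive, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0

# Toy few-shot setup (assumed): two clients, four labeled examples each.
# Each example is ([bias, feature], label); label is 1 when the feature is positive.
clients = [
    [([1.0, 2.0], 1), ([1.0, -1.5], 0), ([1.0, 1.0], 1), ([1.0, -2.0], 0)],
    [([1.0, 0.5], 1), ([1.0, -0.5], 0), ([1.0, 1.5], 1), ([1.0, -1.0], 0)],
]
global_w = run_rounds(clients, dim=2)
```

In this sketch the server only ever sees weight vectors, which is the privacy property the abstract highlights. A personalized client model, which the abstract mentions as a further direction, could be approximated by running `local_update` once more from `global_w` on a single client's data, though the paper may personalize differently.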

