CL · Oct 24, 2022

Multilingual Auxiliary Tasks Training: Bridging the Gap between Languages for Zero-Shot Transfer of Hate Speech Detection Models

Syrielle Montariol, Arij Riabi, Djamé Seddah

arXiv:2210.13029v224.4304 citations · h-index: 26 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses hate speech detection across languages for social media applications. It is rated incremental because it builds on existing transfer learning methods that use auxiliary tasks.

The paper tackles zero-shot cross-lingual transfer for hate speech detection, which is difficult because of linguistic and cultural gaps between languages. It proposes training on multilingual auxiliary tasks such as sentiment analysis and named entity recognition, which improves zero-shot performance across languages.

Zero-shot cross-lingual transfer learning has been shown to be highly challenging for tasks involving a lot of linguistic specificities or when a cultural gap is present between languages, such as in hate speech detection. In this paper, we highlight this limitation for hate speech detection in several domains and languages using strict experimental settings. Then, we propose to train on multilingual auxiliary tasks -- sentiment analysis, named entity recognition, and tasks relying on syntactic information -- to improve zero-shot transfer of hate speech detection models across languages. We show how hate speech detection models benefit from a cross-lingual knowledge proxy brought by auxiliary tasks fine-tuning and highlight these tasks' positive impact on bridging the hate speech linguistic and cultural gap between languages.
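The zero-shot setup described in the abstract can be sketched as a small experiment plan. This is an illustrative sketch only, not the authors' code: the helper names (`build_zero_shot_plan`, `leaks_target_labels`), the task labels, the language codes, and the ordering of auxiliary training before hate speech training are all assumptions. It shows the one constraint the abstract does state: auxiliary tasks may be multilingual, while hate speech labels come from the source language only and evaluation happens on other languages.

```python
from dataclasses import dataclass

# Auxiliary tasks named in the abstract; the short labels are illustrative.
AUXILIARY_TASKS = ("sentiment_analysis", "named_entity_recognition", "syntax")


@dataclass(frozen=True)
class Stage:
    task: str
    languages: tuple  # languages whose labelled data this stage uses
    mode: str         # "train" or "eval"


def build_zero_shot_plan(source_lang, target_langs, aux_tasks=AUXILIARY_TASKS):
    """Hypothetical helper: list the stages of one zero-shot experiment."""
    if source_lang in target_langs:
        raise ValueError("zero-shot targets must exclude the source language")
    all_langs = (source_lang, *target_langs)
    # Assumption: auxiliary tasks are trained on source and target languages.
    plan = [Stage(task, all_langs, "train") for task in aux_tasks]
    # Hate speech labels are used in the source language only.
    plan.append(Stage("hate_speech", (source_lang,), "train"))
    # Evaluation is on languages never seen with hate speech labels.
    plan.append(Stage("hate_speech", tuple(target_langs), "eval"))
    return plan


def leaks_target_labels(plan, target_langs):
    """True if any training stage uses hate speech labels in a target language."""
    targets = set(target_langs)
    return any(
        stage.task == "hate_speech"
        and stage.mode == "train"
        and targets & set(stage.languages)
        for stage in plan
    )


if __name__ == "__main__":
    # Language codes here are placeholders, not the paper's language set.
    for stage in build_zero_shot_plan("en", ["es", "it"]):
        print(stage)
```

A check such as `leaks_target_labels` is one way to keep an experiment within a strict zero-shot setting; the paper's own protocol may differ.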
