UM-IU@LING at SemEval-2019 Task 6: Identifying Offensive Tweets Using BERT and SVMs
This work addresses automated hate speech detection in social media, with content moderation as the target application.
The paper tackles the problem of identifying and categorizing hate speech in tweets for SemEval 2019 Task 6, achieving a macro F1 score of 0.8136 for detecting abusive content in subtask A (ranking 3rd out of 103 submissions) and 0.5243 for identifying abuse targets in subtask C (ranking 27th out of 65 submissions).
This paper describes the UM-IU@LING system for SemEval 2019 Task 6: OffensEval. We take a mixed approach to identifying and categorizing hate speech in social media. In subtask A, we fine-tuned a BERT-based classifier to detect abusive content in tweets, achieving a macro F1 score of 0.8136 on the test data and ranking 3rd out of 103 submissions. In subtasks B and C, we used a linear SVM with selected character n-gram features. For subtask C, our system identified the target of abuse with a macro F1 score of 0.5243, ranking 27th out of 65 submissions.
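To make the two technical terms in the abstract concrete, the following minimal sketch shows how character n-gram counts can be extracted from a tweet and how a macro F1 score is computed. It is an illustration only, not the system described here: the function names, the n-gram range of 2 to 3, and the toy labels are assumptions, and the feature selection and the SVM itself are not reproduced.

```python
from collections import Counter


def char_ngrams(text, n_min=2, n_max=3):
    """Count every character n-gram of length n_min..n_max in text.

    The range 2..3 is an illustrative assumption, not the range used by the system.
    """
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data, hypothetical and for illustration only.
features = char_ngrams("so rude")
gold = ["OFF", "NOT", "NOT", "OFF"]
pred = ["OFF", "NOT", "OFF", "OFF"]
score = macro_f1(gold, pred)
```

Because macro F1 averages the per-class scores without weighting by class frequency, errors on a rare class lower the score as much as errors on a frequent one, which matters for imbalanced label sets such as abuse targets.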