CL Jun 5, 2017

One-step and Two-step Classification for Abusive Language Detection on Twitter

arXiv:1706.01206v11152 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the detection of abusive language, specifically sexism and racism, on social media platforms. Its contribution is incremental: it compares existing classification methods on a public Twitter corpus.

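The paper tackles abusive language detection on Twitter by comparing a one-step multi-class classification approach with a two-step approach that first detects abusive language and then classifies its type. On a corpus of about 20,000 tweets, it reports F-measures of 0.827 for the one-step approach with HybridCNN and 0.824 for the two-step approach with logistic regression.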

Automatic abusive language detection is a difficult but important task for online social media. Our research explores a two-step approach of first classifying whether language is abusive and then classifying it into specific types, and compares it with a one-step approach of performing a single multi-class classification to detect sexist and racist language. On a public English Twitter corpus of 20 thousand tweets labeled for sexism and racism, our approach shows a promising performance of 0.827 F-measure using HybridCNN in one step and 0.824 F-measure using logistic regression in two steps.
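
The difference between the two setups described in the abstract can be sketched in a few lines of Python. This is only an illustration of the control flow, not the authors' implementation: the function names, the label strings, and the keyword-based toy classifiers are all hypothetical stand-ins for trained models such as HybridCNN or logistic regression.

```python
from typing import Callable

# Assumed label set for illustration: "none", "sexism", "racism".
Label = str


def one_step(tweet: str, multi_class: Callable[[str], Label]) -> Label:
    """One-step: a single multi-class classifier chooses among all labels."""
    return multi_class(tweet)


def two_step(
    tweet: str,
    is_abusive: Callable[[str], bool],
    abuse_type: Callable[[str], Label],
) -> Label:
    """Two-step: detect abuse first, then classify the type of abusive tweets only."""
    if not is_abusive(tweet):
        return "none"
    return abuse_type(tweet)


# Toy stand-ins (hypothetical): placeholder cue tokens instead of trained models.
SEXISM_CUES = {"sx_cue"}
RACISM_CUES = {"rc_cue"}


def _tokens(tweet: str) -> set:
    return set(tweet.lower().split())


def toy_multi_class(tweet: str) -> Label:
    tokens = _tokens(tweet)
    if tokens & SEXISM_CUES:
        return "sexism"
    if tokens & RACISM_CUES:
        return "racism"
    return "none"


def toy_is_abusive(tweet: str) -> bool:
    return bool(_tokens(tweet) & (SEXISM_CUES | RACISM_CUES))


def toy_abuse_type(tweet: str) -> Label:
    # Only called on tweets already flagged as abusive.
    return "sexism" if _tokens(tweet) & SEXISM_CUES else "racism"
```

In the two-step setup the second classifier never sees non-abusive tweets, so it can be trained on the abusive subset alone, while the one-step classifier has to separate all three classes at once.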

Code Implementations: 1 repo
