Degree based Classification of Harmful Speech using Twitter Data
This work addresses harmful speech classification for social media moderation, but it is incremental as it builds on existing methods with a new dataset.
The paper tackles the problem of classifying harmful speech on social media by creating an ontological classification based on degree of hateful intent and using it to annotate Twitter data, resulting in a new dataset and a supervised classification system for recognizing harmful speech classes.
Harmful speech takes various forms and has been plaguing social media in different ways. If we are to crack down on the different degrees of hate speech and abusive behavior within it, the classification needs to be based on complex ramifications that must be defined and accounted for, beyond whether the speech is racist, sexist, or directed against a particular group or community. This paper primarily describes how we created an ontological classification of harmful speech based on degree of hateful intent and used it to annotate Twitter data accordingly. The key contribution of this paper is the new dataset of tweets we created based on ontological classes and degrees of harmful speech found in the text. We also propose a supervised classification system for recognizing these harmful speech classes in text.