Detecting Hate Speech in Social Media
This work addresses the challenge of identifying harmful content online, but its contribution is incremental: it applies existing supervised classification methods to a new dataset.
The paper tackles hate speech detection in social media, specifically the problem of distinguishing hate speech from general profanity. Using lexical features such as character n-grams, word n-grams and word skip-grams, it achieves 78% accuracy in classifying posts into three categories.
In this paper we examine methods to detect hate speech in social media, while distinguishing this from general profanity. We aim to establish lexical baselines for this task by applying supervised classification methods using a recently released dataset annotated for this purpose. As features, our system uses character n-grams, word n-grams and word skip-grams. We obtain results of 78% accuracy in identifying posts across three classes. Results demonstrate that the main challenge lies in discriminating profanity and hate speech from each other. A number of directions for future work are discussed.
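To make the three feature types concrete, the following is a minimal sketch of how character n-grams, word n-grams and word skip-grams could be extracted from a single post. It is an illustration only, not the authors' implementation: the function names, the whitespace tokenization, the lowercasing, the default orders (`char_n=3`, `word_n=2`, `max_skip=1`) and the choice to exclude adjacent pairs from the skip-grams are all assumptions, since the abstract does not specify them.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return all overlapping character n-grams of `text`."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_ngrams(tokens, n):
    """Return all contiguous word n-grams as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def word_skip_bigrams(tokens, max_skip):
    """Return word pairs with 1..max_skip tokens skipped between them.

    Assumption: adjacent pairs (zero skipped tokens) are left to
    word_ngrams and are not repeated here.
    """
    pairs = []
    for i, left in enumerate(tokens):
        for gap in range(1, max_skip + 1):
            j = i + gap + 1
            if j < len(tokens):
                pairs.append((left, tokens[j]))
    return pairs


def extract_features(post, char_n=3, word_n=2, max_skip=1):
    """Count all three feature types for one post (hypothetical helper)."""
    text = post.lower()
    tokens = text.split()  # assumption: simple whitespace tokenization
    features = Counter()
    # Prefixes keep the three feature spaces from colliding.
    features.update("c:" + g for g in char_ngrams(text, char_n))
    features.update("w:" + " ".join(g) for g in word_ngrams(tokens, word_n))
    features.update("s:" + " ".join(g) for g in word_skip_bigrams(tokens, max_skip))
    return features
```

For example, `extract_features("you are so rude")` counts the word bigram `w:you are` and the 1-skip bigram `s:you so`, alongside character trigrams such as `c:you`. Feature counts of this kind could then be passed to any supervised classifier to produce a three-class prediction.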