Automatic Detection of Online Jihadist Hate Speech
The work addresses the need to monitor extremist content online, but its contribution is incremental: it applies existing NLP and ML techniques to a specific domain.
The paper tackles the problem of automatically detecting online jihadist hate speech, achieving over 80% accuracy by training on a corpus of 45,000 Twitter messages.
We have developed a system that automatically detects online jihadist hate speech with over 80% accuracy using techniques from Natural Language Processing and Machine Learning. The system is trained on a corpus of 45,000 subversive Twitter messages collected from October 2014 to December 2016. We present a qualitative and quantitative analysis of the jihadist rhetoric in the corpus, examine the network of Twitter users, outline the technical procedure used to train the system, and discuss examples of use.
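The abstract does not say which features or classifier the system uses, so the following is only a minimal sketch of the general approach (train a text classifier on labeled messages, then measure accuracy). The character-trigram features, the perceptron learner, the function names, and the four toy messages are all assumptions made for illustration. They are not the authors' method or data.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in a lowercased, space-padded string."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def score(weights, features):
    """Linear score: sum of weight * count over the n-grams in the message."""
    return sum(weights.get(gram, 0.0) * count for gram, count in features.items())


def train_perceptron(samples, epochs=10):
    """Train a perceptron on (text, label) pairs; label is +1 (flagged) or -1 (other)."""
    weights = {}
    for _ in range(epochs):
        for text, label in samples:
            features = char_ngrams(text)
            # Update only when the current weights misclassify the sample.
            if label * score(weights, features) <= 0:
                for gram, count in features.items():
                    weights[gram] = weights.get(gram, 0.0) + label * count
    return weights


def predict(weights, text):
    """Return +1 if the message scores above zero, otherwise -1."""
    return 1 if score(weights, char_ngrams(text)) > 0 else -1


def accuracy(weights, samples):
    """Fraction of samples whose predicted label matches the given label."""
    correct = sum(predict(weights, text) == label for text, label in samples)
    return correct / len(samples)


# Hypothetical toy data standing in for a labeled corpus (not from the paper).
TRAIN = [
    ("destroy the enemy", 1),
    ("lovely weather today", -1),
    ("destroy them all", 1),
    ("great match today", -1),
]

weights = train_perceptron(TRAIN)
print(accuracy(weights, TRAIN))
print(predict(weights, "destroy everything"))
print(predict(weights, "sunny weather today"))
```

Accuracy on the training messages says little about real performance. A figure such as the reported 80% would normally be measured on held-out messages that the classifier never saw during training.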