Automatic verbal aggression detection for Russian and American imageboards
This study addresses the problem of monitoring aggressive behavior in anonymous online forums, which is relevant to community moderators, but its contribution is incremental because it applies an existing method to new data.
The study tackles automatic detection of verbal aggression on American and Russian imageboards. Using word2vec on a dataset of 1,802,789 messages, it reports 88% accuracy for English messages; the results for Russian were weaker.
Aggression is rampant in Internet communities. Anonymous forums, usually called imageboards, are notorious for aggressive and deviant behaviour, even in comparison with other Internet communities. This study examines ways to automatically detect verbal expressions of aggression on the most popular American (4chan.org) and Russian (2ch.hk) imageboards. A set of 1,802,789 messages was used. The word2vec algorithm was applied to detect aggression. The result for English is decent (88%); the results for Russian still need improvement.
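The abstract does not say how word2vec vectors were turned into an aggression decision, so the following is only a minimal sketch of one plausible approach, not the study's method. It assumes a message is scored by the cosine similarity between its mean word vector and the mean vector of a small seed lexicon. The toy vectors, the seed words, the threshold, and all function names are hypothetical; in practice the vectors would come from a word2vec model trained on the imageboard corpus.

```python
import math

# Hand-made 3-dimensional toy vectors, for illustration only (assumption).
# Real vectors would come from a word2vec model trained on the message corpus.
TOY_VECTORS = {
    "idiot": [0.9, 0.1, 0.0],
    "hate": [0.8, 0.2, 0.1],
    "stupid": [0.85, 0.15, 0.05],
    "thanks": [0.05, 0.9, 0.1],
    "nice": [0.1, 0.85, 0.2],
    "picture": [0.1, 0.3, 0.9],
    "you": [0.4, 0.4, 0.4],
}

# Hypothetical seed lexicon of aggressive words (assumption, not from the study).
SEED_WORDS = ["idiot", "hate"]


def mean_vector(words, vectors):
    """Average the vectors of the words that are in the vocabulary.

    Returns None when no word is known.
    """
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def aggression_score(message, vectors=TOY_VECTORS, seeds=SEED_WORDS):
    """Similarity between the message's mean vector and the seed centroid."""
    tokens = message.lower().split()  # naive whitespace tokenizer
    msg_vec = mean_vector(tokens, vectors)
    seed_vec = mean_vector(seeds, vectors)
    if msg_vec is None or seed_vec is None:
        return 0.0  # nothing in vocabulary: treat as not aggressive
    return cosine(msg_vec, seed_vec)


def is_aggressive(message, threshold=0.8):
    """Flag a message when its score reaches a hypothetical threshold."""
    return aggression_score(message) >= threshold


if __name__ == "__main__":
    for text in ["you stupid idiot", "thanks nice picture"]:
        print(text, round(aggression_score(text), 3), is_aggressive(text))
```

A seed-similarity score is chosen here only because it needs no labelled training data. A supervised classifier trained on averaged word vectors would be another reasonable design, and the abstract does not indicate which was used.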