When a Tweet is Actually Sexist. A more Comprehensive Classification of Different Online Harassment Categories and The Challenges in NLP
This work addresses the challenge of detecting diverse forms of online harassment for social media users, but it is incremental as it builds on existing classification efforts.
The paper tackles the problem of classifying sexism in social media by proposing a more comprehensive set of categories, such as indirect harassment and sexual harassment, and presents preliminary classification results using machine learning.
Sexism is very common in social media and restricts the freedom of feminist and female users. There is still no comprehensive classification of sexism suited to natural language processing techniques. Categorizing sexism in social media only as hostile or benevolent is so general that it ignores the other types of sexism that occur in these media. This paper proposes a more comprehensive and in-depth set of categories for online harassment in social media (e.g., Twitter): "Indirect harassment", "Information threat", "Sexual harassment", "Physical harassment", and "Not sexist". It addresses the challenge of labeling these categories and presents classification results for them. This is preliminary work that applies machine learning to learn the concept of sexism, and it distinguishes itself by examining more precise categories of sexism in social media.
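
To make the five-way task concrete, the following is a minimal sketch of how predictions over the proposed categories could be scored. The abstract does not state which evaluation metric the paper uses, so per-class and macro-averaged F1 are assumptions chosen here because they keep rare categories visible. The function names `per_class_f1` and `macro_f1` are hypothetical and do not come from the paper.

```python
from collections import Counter

# The five categories named in the abstract.
LABELS = [
    "Indirect harassment",
    "Information threat",
    "Sexual harassment",
    "Physical harassment",
    "Not sexist",
]


def per_class_f1(gold, predicted, labels=LABELS):
    """Return {label: F1} for parallel lists of gold and predicted labels.

    Hypothetical helper: the metric choice is an assumption, not the paper's.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # p was predicted, but the true label differs
            fn[g] += 1  # the true label g was missed
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); set to 0.0 when the label never occurs.
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(gold, predicted, labels=LABELS):
    """Unweighted mean of per-class F1, so each category counts equally."""
    scores = per_class_f1(gold, predicted, labels)
    return sum(scores.values()) / len(labels)
```

Macro averaging is one reasonable choice when "Not sexist" is likely to dominate the data, since plain accuracy can look high even if the harassment categories are rarely predicted correctly.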