CL, LG, ML | Sep 1, 2020

Generalisation of Cyberbullying Detection

arXiv:2009.01046v1 | 0.88 citations

Originality: Synthesis-oriented

AI Analysis

This paper addresses the challenge of inconsistent cyberbullying detection in online communities, but its contribution is incremental, as it builds on existing datasets and methods.

The paper investigates how varying definitions of cyberbullying across datasets affect the portability and generalization of classifiers, analyzing ensemble models to understand their interactions.

Cyberbullying is a problem in today's ubiquitous online communities. Filtering it out of online conversations has proven a challenge, and efforts have led to the creation of many different datasets, all offered as resources to train classifiers. Through these datasets, we will explore the variety of definitions of cyberbullying behaviors and the impact of these differences on the portability of one classifier to another community. By analyzing the similarities between datasets, we also gain insight on the generalization power of the classifiers trained from them. A study of ensemble models combining these classifiers will help us understand how they interact with each other.
