SICYLGFeb 10, 2022

Characterizing, Detecting, and Predicting Online Ban Evasion

arXiv:2202.05257v114 citationsHas Code
Originality Incremental advance
AI Analysis

This work addresses the issue of malicious users circumventing bans on platforms like Wikipedia, which is incremental as it builds on prior research on online deception but focuses specifically on ban evasion.

This paper tackles the problem of ban evasion on online platforms by conducting the first data-driven study, using a novel dataset of 8,551 ban evasion pairs from Wikipedia to analyze behavioral similarities and train classifiers for detection and prediction, achieving AUC scores up to 0.85 and MRR of 0.97.

Moderators and automated methods enforce bans on malicious users who engage in disruptive behavior. However, malicious users can easily create a new account to evade such bans. Previous research has focused on other forms of online deception, like the simultaneous operation of multiple accounts by the same entities (sockpuppetry), impersonation of other individuals, and studying the effects of de-platforming individuals and communities. Here we conduct the first data-driven study of ban evasion, i.e., the act of circumventing bans on an online platform, leading to temporally disjoint operation of accounts by the same user. We curate a novel dataset of 8,551 ban evasion pairs (parent, child) identified on Wikipedia and contrast their behavior with benign users and non-evading malicious users. We find that evasion child accounts demonstrate similarities with respect to their banned parent accounts on several behavioral axes - from similarity in usernames and edited pages to similarity in content added to the platform and its psycholinguistic attributes. We reveal key behavioral attributes of accounts that are likely to evade bans. Based on the insights from the analyses, we train logistic regression classifiers to detect and predict ban evasion at three different points in the ban evasion lifecycle. Results demonstrate the effectiveness of our methods in predicting future evaders (AUC = 0.78), early detection of ban evasion (AUC = 0.85), and matching child accounts with parent accounts (MRR = 0.97). Our work can aid moderators by reducing their workload and identifying evasion pairs faster and more efficiently than current manual and heuristic-based approaches. Dataset is available https://github.com/srijankr/ban_evasion.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes