IR · Dec 19, 2017

Wikidata Vandalism Detection - The Loganberry Vandalism Detector at WSDM Cup 2017

arXiv:1712.06922v1 · 6 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the spread of false information in Wikidata, which anyone can edit, by providing a detection method for the WSDM Cup 2017 challenge. It is incremental in that it builds on existing vandalism detection approaches.

The paper tackled the problem of detecting vandalism in Wikidata by developing a system that computes a vandalism score for revisions, achieving an ROC-AUC of 0.91976 on a held-out test set.

Wikidata is the new, large-scale knowledge base of the Wikimedia Foundation. As it can be edited by anyone, entries frequently get vandalized, creating the risk that falsified information spreads if such edits are not detected. The WSDM 2017 Wiki Vandalism Detection Challenge requires us to solve this problem by computing a vandalism score denoting the likelihood that a revision corresponds to an act of vandalism. Performance is measured using the ROC-AUC obtained on a held-out test set. This paper provides the details of our submission, which obtained an ROC-AUC score of 0.91976 in the final evaluation.
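To make the evaluation metric concrete, here is a minimal sketch of how ROC-AUC can be computed from per-revision vandalism scores. It is not the challenge's evaluation code or the authors' system: the function name `roc_auc`, the labels, and the scores are all invented for illustration. It uses the pairwise interpretation of ROC-AUC, the probability that a randomly chosen vandalism revision receives a higher score than a randomly chosen benign one.

```python
def roc_auc(labels, scores):
    """ROC-AUC via pairwise comparison.

    labels: 1 for a vandalism revision, 0 for a benign one.
    scores: vandalism score per revision (higher = more suspicious).
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not from the challenge).
labels = [1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.5, 0.1]
auc = roc_auc(labels, scores)
print(f"ROC-AUC: {auc:.4f}")
```

The quadratic loop keeps the definition visible. On a dataset of realistic size, a rank-based implementation such as a library routine would be the practical choice. Because the metric depends only on the ordering of scores, the scores need not be calibrated probabilities.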

