IR, Dec 16, 2017

Overview of the Wikidata Vandalism Detection Task at WSDM Cup 2017

arXiv:1712.05956v1
Originality: Synthesis-oriented
AI Analysis

This paper addresses the problem of maintaining data quality on Wikidata for users and researchers, though its contribution is incremental because it builds on existing vandalism detection work.

The paper tackles Wikidata vandalism detection by recasting it as an online learning task that requires near real-time predictions; the best-performing approach achieves a ROC-AUC of 0.947 and a PR-AUC of 0.458.

We report on the Wikidata vandalism detection task at the WSDM Cup 2017. The task received five submissions; this paper describes their evaluation and compares them to state-of-the-art baselines. Unlike previous work, we recast Wikidata vandalism detection as an online learning problem, requiring participant software to predict vandalism in near real-time. The best-performing approach achieves a ROC-AUC of 0.947 at a PR-AUC of 0.458. This task was organized as a software submission task: to maximize reproducibility and to foster future research and development on this task, the participants were asked to submit their working software to the TIRA experimentation platform along with the source code for open source release.
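The abstract reports two ranking metrics whose values differ sharply (ROC-AUC 0.947 versus PR-AUC 0.458). The sketch below illustrates, on made-up toy data, how both can be computed from per-revision vandalism scores and why a single high-scoring false positive hurts the precision-recall metric more than ROC-AUC when vandalism is rare. It is not the task's evaluation code: the `labels` and `scores` values are invented, the function names are hypothetical, and average precision is used as one common estimate of PR-AUC, which may differ from the estimator used in the paper.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of precision values at each true positive, ranked by descending score.

    Assumes scores are distinct; this is one common estimate of PR-AUC.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    total_pos = sum(labels)
    hits, ap = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            ap += hits / rank  # precision at this recall step
    return ap / total_pos


# Invented toy data: 10 revisions, 2 of them vandalism (label 1).
labels = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
scores = [0.10, 0.20, 0.90, 0.95, 0.30, 0.05, 0.15, 0.60, 0.25, 0.12]

print(f"ROC-AUC: {roc_auc(labels, scores):.3f}")            # 0.875
print(f"PR-AUC (AP): {average_precision(labels, scores):.3f}")  # 0.583
```

In this toy case one benign revision receives the highest score. It costs 2 of 16 positive-negative pairs for ROC-AUC, yet it caps precision at 0.5 for the first detected vandalism, so the precision-recall estimate drops much further.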

