Distilling Information Reliability and Source Trustworthiness from Digital Traces
This work addresses content quality assessment in online knowledge repositories. It offers a method to distill robust metrics from biased user data, an incremental improvement to existing evaluation systems.
The paper tackles the problem of estimating information reliability and source trustworthiness from noisy user evaluations on online platforms. It proposes a temporal point process model that accurately predicts evaluation events and provides interpretable measures on Wikipedia and Stack Overflow data.
Online knowledge repositories typically rely on their users or dedicated editors to evaluate the reliability of their content. These evaluations can be viewed as noisy measurements of both information reliability and information source trustworthiness. Can we leverage these noisy evaluations, often biased, to distill a robust, unbiased and interpretable measure of both notions? In this paper, we argue that the temporal traces left by these noisy evaluations give cues on the reliability of the information and the trustworthiness of the sources. Then, we propose a temporal point process modeling framework that links these temporal traces to robust, unbiased and interpretable notions of information reliability and source trustworthiness. Furthermore, we develop an efficient convex optimization procedure to learn the parameters of the model from historical traces. Experiments on real-world data gathered from Wikipedia and Stack Overflow show that our modeling framework accurately predicts evaluation events, provides an interpretable measure of information reliability and source trustworthiness, and yields interesting insights about real-world events.
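To make the idea of learning a temporal point process from historical traces concrete, the following is a minimal sketch and not the model proposed in the paper. It assumes a toy piecewise-constant intensity over fixed time bins; the names `log_likelihood`, `fit_rates`, `rates`, and `bin_edges` are hypothetical and introduced only for illustration. The log-likelihood of any temporal point process is the sum of log-intensities at the event times minus the integral of the intensity over the observation window. For this toy parameterization it is concave in the rates, so the maximum-likelihood fit is a convex problem with a closed-form solution.

```python
import math


def log_likelihood(event_times, rates, bin_edges):
    """Log-likelihood of events under a piecewise-constant intensity.

    rates[k] is the assumed intensity on [bin_edges[k], bin_edges[k + 1]).
    Computes: sum_i log(intensity(t_i)) - integral of the intensity.
    A bin that contains events must have a strictly positive rate.
    """
    ll = 0.0
    for k, rate in enumerate(rates):
        lo, hi = bin_edges[k], bin_edges[k + 1]
        n_k = sum(1 for t in event_times if lo <= t < hi)
        if n_k > 0:
            ll += n_k * math.log(rate)  # log-intensity at each event
        ll -= rate * (hi - lo)          # integral of the intensity over the bin
    return ll


def fit_rates(event_times, bin_edges):
    """Maximum-likelihood rates: event count divided by bin width."""
    rates = []
    for k in range(len(bin_edges) - 1):
        lo, hi = bin_edges[k], bin_edges[k + 1]
        n_k = sum(1 for t in event_times if lo <= t < hi)
        rates.append(n_k / (hi - lo))
    return rates


# Toy evaluation-event timestamps (made up for illustration).
events = [0.5, 1.0, 1.5, 3.0]
edges = [0.0, 2.0, 4.0]
fitted = fit_rates(events, edges)
```

Here `fitted` holds one rate per bin, and evaluating `log_likelihood(events, fitted, edges)` against any other choice of rates shows that the fitted values maximize the likelihood. A richer model would replace the per-bin rates with an intensity that depends on parameters such as reliability and trustworthiness, keeping the same likelihood structure.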