SE CY Mar 11, 2016

Gold Standard for Expert Ranking: A Survey on the XWiki Dataset

arXiv:1603.03809v1 1 citation

Originality Synthesis-oriented

AI Analysis

This work addresses the need for reliable evaluation benchmarks in expert recommendation systems, but it is incremental as it focuses on validating a specific dataset without introducing new methods.

The paper addresses the problem of creating a gold standard for evaluating automated expert ranking systems in Requirements Engineering. The authors ran a survey in which external participants ranked the participants of a set of discussions. They conclude that the resulting gold standard is reasonable, although its total correctness is not established, and they observe that the most reliable subjects produced the least ordered rankings.

We are designing an automated technique to find and recommend experts to help with Requirements Engineering tasks, which can be done by ranking the available people by level of expertise. To evaluate the correctness of the rankings produced by the automated technique, we want to compare them to a gold standard. In this work, we ask external people to look at a set of discussions and to rank their participants, and we then evaluate how reliably these rankings can serve as a gold standard. We describe the setting and running of this survey, the method used to build the gold standard from the rankings of the subjects, and the analysis of the results to obtain and validate this gold standard. Through the analysis of the results, we conclude that we obtained a reasonable gold standard, although we lack evidence to support its total correctness. We also made the interesting observation that the most reliable subjects build the least ordered rankings (i.e. rankings that have few ranks, with several people per rank), which goes against the usual assumptions of Information Retrieval measures.
