IR
Jun 3, 2018

Mix and Match: Collaborative Expert-Crowd Judging for Building Test Collections Accurately and Affordably

arXiv:1806.00755v3
6 citations
Originality: Incremental advance
AI Analysis

This work addresses the cost and scalability of building information retrieval test collections, a concern for both researchers and practitioners. The advance is incremental, since it builds on existing crowdsourcing methods.

The paper tackles the problem of building accurate and affordable IR test collections by combining expert and crowd assessors, showing that intelligent distribution of work between them yields encouraging results on two TREC collections.

Crowdsourcing offers an affordable and scalable means to collect relevance judgments for IR test collections. However, crowd assessors may show higher variance in judgment quality than trusted assessors. In this paper, we investigate how to effectively utilize both groups of assessors in partnership. We specifically investigate how agreement in judging is correlated with three factors: relevance category, document rankings, and topical variance. Based on this, we then propose two collaborative judging methods in which a portion of the document-topic pairs are assessed by in-house judges while the rest are assessed by crowd-workers. Experiments conducted on two TREC collections show encouraging results when we distribute work intelligently between our two groups of assessors.
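
The agreement analysis mentioned in the abstract can be illustrated with a minimal sketch. It computes, for each relevance category assigned by the trusted assessor, the fraction of document-topic pairs on which the crowd label matches. The function name `agreement_by_category`, the label strings, and the toy judgments are all hypothetical and chosen for illustration only; the paper's own agreement measure and data may differ.

```python
from collections import defaultdict


def agreement_by_category(pairs):
    """Return {expert_label: fraction of pairs where the crowd label matches}.

    `pairs` is an iterable of (expert_label, crowd_label) tuples, one per
    document-topic pair. Raw agreement is an assumption made for this
    sketch, not necessarily the measure used in the paper.
    """
    totals = defaultdict(int)
    matches = defaultdict(int)
    for expert_label, crowd_label in pairs:
        totals[expert_label] += 1
        if expert_label == crowd_label:
            matches[expert_label] += 1
    return {label: matches[label] / totals[label] for label in totals}


# Toy judgments, invented for illustration (not taken from any TREC collection).
judgments = [
    ("relevant", "relevant"),
    ("relevant", "non-relevant"),
    ("non-relevant", "non-relevant"),
    ("non-relevant", "non-relevant"),
]
rates = agreement_by_category(judgments)
for label, rate in sorted(rates.items()):
    print(f"{label}: {rate:.2f}")
```

The same grouping could be keyed on document rank or topic instead of relevance category to look at the other two factors the abstract names.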
