GTLGTHOct 3, 2021

Information Elicitation Meets Clustering

arXiv:2110.00952v1
Originality Incremental advance
AI Analysis

This work addresses the challenge of robust information elicitation in crowdsourcing, which is incremental by building on prior methods like 'surprisingly popular' and introducing clustering-based approaches.

The paper tackles the problem of aggregating subjective evaluations from crowds, where traditional methods like plurality vote and 'surprisingly popular' are not robust to strategic behavior, by proposing a new information aggregation method based on clustering; it introduces Determinant MaxImization (DMI)-clustering as a general method invariant to affine transformations and solvable in polynomial time for constant dimensions, and applies this to both multi-task and single-task settings with spectral methods.

In the setting where we want to aggregate people's subjective evaluations, plurality vote may be meaningless when a large amount of low-effort people always report "good" regardless of the true quality. "Surprisingly popular" method, picking the most surprising answer compared to the prior, handle this issue to some extent. However, it is still not fully robust to people's strategies. Here in the setting where a large number of people are asked to answer a small number of multi-choice questions (multi-task, large group), we propose an information aggregation method that is robust to people's strategies. Interestingly, this method can be seen as a rotated "surprisingly popular". It is based on a new clustering method, Determinant MaxImization (DMI)-clustering, and a key conceptual idea that information elicitation without ground-truth can be seen as a clustering problem. Of independent interest, DMI-clustering is a general clustering method that aims to maximize the volume of the simplex consisting of each cluster's mean multiplying the product of the cluster sizes. We show that DMI-clustering is invariant to any non-degenerate affine transformation for all data points. When the data point's dimension is a constant, DMI-clustering can be solved in polynomial time. In general, we present a simple heuristic for DMI-clustering which is very similar to Lloyd's algorithm for k-means. Additionally, we also apply the clustering idea in the single-task setting and use the spectral method to propose a new aggregation method that utilizes the second-moment information elicited from the crowds.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes