CL · CY · Oct 12, 2021

Dealing with Disagreements: Looking Beyond the Majority Vote in Subjective Annotations

arXiv:2110.05719v1 · 706 citations
Originality: Incremental advance
AI Analysis

This paper addresses the tendency to discard annotator-level nuance in subjective labeling, a concern for researchers and practitioners in NLP and social computing. The contribution is incremental, since it builds on existing multi-annotator modeling.

The paper tackles the problem of annotator disagreements in subjective tasks like affect and hate speech detection by proposing a multi-task model that predicts individual annotator judgments instead of aggregating labels. The approach matches or outperforms label aggregation across seven binary classification tasks and provides better uncertainty estimates that correlate with annotation disagreements.

Majority voting and averaging are common approaches employed to resolve annotator disagreements and derive single ground truth labels from multiple annotations. However, annotators may systematically disagree with one another, often reflecting their individual biases and values, especially in the case of subjective tasks such as detecting affect, aggression, and hate speech. Annotator disagreements may capture important nuances in such tasks that are often ignored while aggregating annotations to a single ground truth. In order to address this, we investigate the efficacy of multi-annotator models. In particular, our multi-task based approach treats predicting each annotator's judgements as a separate subtask, while sharing a common learned representation of the task. We show that this approach yields the same or better performance than aggregating labels in the data prior to training across seven different binary classification tasks. Our approach also provides a way to estimate uncertainty in predictions, which we demonstrate correlates better with annotation disagreements than traditional methods do. Being able to model uncertainty is especially useful in deployment scenarios where knowing when not to make a prediction is important.
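
The multi-task idea in the abstract can be illustrated with a minimal sketch in plain Python. It assumes a shared encoder has already produced one logit per annotator head for a single item. The function names, the use of `None` for annotators who did not label the item, the strict-majority tie rule, and the choice of vote variance as the uncertainty score are all assumptions made for illustration, not details confirmed by the text above.

```python
import math
from typing import List, Optional, Tuple


def sigmoid(z: float) -> float:
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def masked_multitask_loss(logits: List[float],
                          labels: List[Optional[int]]) -> float:
    """Mean binary cross-entropy over the annotators who labelled this item.

    logits[i] is the (hypothetical) output of annotator head i.
    labels[i] is 0, 1, or None when annotator i did not label the item,
    in which case that head contributes nothing to the loss.
    """
    total, count = 0.0, 0
    for z, y in zip(logits, labels):
        if y is None:
            continue  # mask out missing annotations
        # Numerically stable form of -[y*log(p) + (1-y)*log(1-p)]
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
        count += 1
    return total / count if count else 0.0


def predict_with_uncertainty(logits: List[float]) -> Tuple[int, float]:
    """Aggregate the per-annotator predictions for one item.

    Returns (label, uncertainty). The label is a strict majority vote over
    the predicted annotator labels (assumption: ties go to 0). The
    uncertainty is the variance of those binary votes, which is 0 when all
    heads agree and peaks at 0.25 when they split evenly.
    """
    votes = [1 if sigmoid(z) >= 0.5 else 0 for z in logits]
    positive_share = sum(votes) / len(votes)
    label = 1 if positive_share > 0.5 else 0
    uncertainty = positive_share * (1.0 - positive_share)
    return label, uncertainty
```

In a deployment setting, a caller could compare the returned uncertainty against a threshold and abstain from predicting when the simulated annotators disagree too much; the threshold itself would need to be tuned per task.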

Code Implementations: 1 repo