CY · CV · HC · Jun 27, 2023

"Is a picture of a bird a bird": Policy recommendations for dealing with ambiguity in machine vision models

arXiv:2306.15777v1 · 6 citations · h-index: 43
Originality: Synthesis-oriented
AI Analysis

This paper addresses the unrealistic assumption of a single ground truth label in machine learning annotation, a concern for both researchers and practitioners. Its contribution is incremental, building on existing work on annotation variability.

The paper tackles the problem of inherent ambiguity in human annotations for machine vision tasks, where subjective judgments lead to multiple valid labels, and recommends best practices for handling such ambiguity in datasets.

Many questions that we ask about the world do not have a single clear answer, yet typical human annotation set-ups in machine learning assume there must be a single ground truth label for all examples in every task. The divergence between reality and practice is stark, especially in cases with inherent ambiguity and where the range of different subjective judgments is wide. Here, we examine the implications of subjective human judgments in the behavioral task of labeling images used to train machine vision models. We identify three primary sources of ambiguity arising from (i) depictions of labels in the images, (ii) raters' backgrounds, and (iii) the task definition. On the basis of the empirical results, we suggest best practices for handling label ambiguity in machine learning datasets.
