HC Oct 14, 2021

Is that a Duiker or Dik Dik Next to the Giraffe? Impacts of Uncertainty on Classification Efficiency in Citizen Science

Vinod Kumar Ahuja, Holly K. Rosser, Andrea Grover

arXiv:2110.07750v1

Originality: Synthesis-oriented

AI Analysis

This work addresses quality control challenges in citizen science by identifying factors that reduce classification efficiency. The contribution is incremental, building on existing methods for analyzing uncertainty.

The study investigated how image complexity and quality issues affect classification efficiency and consensus in citizen science projects, finding that different content types require tailored consensus measures and handling strategies.

Quality control is an ongoing concern in citizen science that is often managed by replication to consensus in online tasks such as image classification. Numerous factors can lead to disagreement, including image quality problems, interface specifics, and the complexity of the content itself. We conducted trace ethnography with statistical and qualitative analyses of six Snapshot Safari projects to understand the content characteristics that can lead to uncertainty and low consensus. This study contributes content categorization based on aggregate classifications to characterize image complexity, with analysis that confirms that the categories impact classification efficiency, and an inductively generated set of additional image quality issues that also impact volunteers' ability to confidently classify content. The results suggest that different conceptualizations and measures of consensus may be needed for different types of content, and aggregate responses offer a way to identify content that needs different handling when complexity cannot be determined a priori.
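To make the idea of measuring consensus from aggregate classifications concrete, the following is a minimal sketch, not the authors' method. It assumes each image carries a plain list of volunteer species labels and computes two illustrative measures: the fraction of volunteers agreeing with the plurality label, and a normalized Shannon entropy (Pielou evenness) of the label distribution. The function name `consensus_measures` and the example labels are hypothetical.

```python
import math
from collections import Counter


def consensus_measures(labels):
    """Summarize volunteer agreement for one image.

    `labels` is a non-empty list of species names, one per volunteer.
    Returns (plurality_label, agreement, evenness), where agreement is the
    share of votes for the plurality label and evenness is Shannon entropy
    divided by its maximum for the number of distinct labels observed
    (0.0 means unanimous, 1.0 means votes split evenly).
    """
    counts = Counter(labels)
    total = sum(counts.values())
    top_label, top_count = counts.most_common(1)[0]
    agreement = top_count / total

    if len(counts) < 2:
        # A single distinct label means no disagreement at all.
        evenness = 0.0
    else:
        entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
        evenness = entropy / math.log(len(counts))

    return top_label, agreement, evenness


# Hypothetical votes for two images: one contested, one unanimous.
contested = ["duiker"] * 6 + ["dikdik"] * 4
unanimous = ["giraffe"] * 5

for votes in (contested, unanimous):
    print(consensus_measures(votes))
```

Under these assumptions, an image with low agreement and high evenness would be a candidate for different handling, which is one way aggregate responses could flag complex content without knowing its difficulty in advance.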
