Aligning Intraobserver Agreement by Transitivity
This work addresses annotation reproducibility for researchers and practitioners in fields such as machine learning and data annotation. It appears incremental, as it builds on existing concepts of transitivity.
The paper addresses the problem of measuring annotator intraobserver agreement. It proposes a transitivity-based method that is less sensitive to bias than test-retest strategies, presents a representation theorem for mapping transitive data to a scale, and discusses how transitivity can help with the quadratic complexity of data collection.
Annotation reproducibility and accuracy rely on good consistency within annotators. We propose a novel method for measuring within-annotator consistency, or annotator intraobserver agreement (IA). The proposed approach is based on transitivity, a property that has been thoroughly studied in the context of rational decision-making. Compared with the commonly used test-retest strategy for annotator IA, the transitivity measure is less sensitive to the several types of bias that the test-retest strategy introduces. We present a representation theorem showing that relative judgement data that satisfy transitivity can be mapped to a scale (in the sense of measurement theory). We also discuss a further application of transitivity in data collection design, where it helps address the quadratic complexity of collecting relative judgements.
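To make the idea of a transitivity-based consistency score concrete, the following is a minimal sketch, not the measure defined in the paper. It assumes a single annotator has given a strict preference (no ties) for every pair of items, and it scores consistency as the fraction of item triples that do not form a preference cycle. The names `beats` and `transitivity_score`, and the dictionary layout of `judgements`, are hypothetical choices made for illustration.

```python
from itertools import combinations


def beats(judgements, a, b):
    """Return True if the annotator ranked `a` above `b`.

    `judgements` maps an ordered pair (x, y) to True when x was ranked
    above y. Each unordered pair is stored once (assumed layout).
    """
    if (a, b) in judgements:
        return judgements[(a, b)]
    return not judgements[(b, a)]


def transitivity_score(items, judgements):
    """Fraction of item triples whose pairwise judgements are transitive.

    Assumes strict, complete judgements: every pair was compared and no
    ties were allowed. Under that assumption a triple is intransitive
    exactly when its three judgements form a cycle.
    """
    triples = list(combinations(items, 3))
    if not triples:
        raise ValueError("need at least three items")
    cyclic = 0
    for a, b, c in triples:
        ab = beats(judgements, a, b)
        bc = beats(judgements, b, c)
        ca = beats(judgements, c, a)
        # a>b, b>c, c>a (all True) or the reverse cycle (all False).
        if ab == bc == ca:
            cyclic += 1
    return 1 - cyclic / len(triples)


# Illustrative data: one cycle among A, B, C; D is ranked below all three.
items = ["A", "B", "C", "D"]
judgements = {
    ("A", "B"): True,
    ("B", "C"): True,
    ("C", "A"): True,  # closes the cycle A > B > C > A
    ("A", "D"): True,
    ("B", "D"): True,
    ("C", "D"): True,
}
score = transitivity_score(items, judgements)
print(score)
```

In this example one of the four triples is cyclic, so the score is 0.75. The sketch also makes the data collection cost visible: a complete set of judgements over n items needs n(n-1)/2 comparisons, which is the quadratic complexity referred to above.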