CL · Nov 26, 2025

Tracing How Annotators Think: Augmenting Preference Judgments with Reading Processes

arXiv:2511.21912v1
Originality: Incremental advance
AI Analysis

This work targets annotation reliability and annotator disagreement in subjective NLP tasks, a concern for both researchers and practitioners. It is an incremental advance: it builds on existing annotation frameworks by adding cognitive insights.

The paper studies how annotators reach preference judgments by capturing their reading processes, such as focus and re-reading, with mouse tracking. The resulting PreferRead dataset shows that annotators re-read a response in about half of trials, most often the option they go on to choose, and that re-reading is associated with higher inter-annotator agreement while longer reading paths are associated with lower agreement.

We propose an annotation approach that captures not only labels but also the reading process underlying annotators' decisions, e.g., what parts of the text they focus on, re-read or skim. Using this framework, we conduct a case study on the preference annotation task, creating a dataset PreferRead that contains fine-grained annotator reading behaviors obtained from mouse tracking. PreferRead enables detailed analysis of how annotators navigate between a prompt and two candidate responses before selecting their preference. We find that annotators re-read a response in roughly half of all trials, most often revisiting the option they ultimately choose, and rarely revisit the prompt. Reading behaviors are also significantly related to annotation outcomes: re-reading is associated with higher inter-annotator agreement, whereas long reading paths and times are associated with lower agreement. These results demonstrate that reading processes provide a complementary cognitive dimension for understanding annotator reliability, decision-making and disagreement in complex, subjective NLP tasks. Our code and data are publicly available.
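The reading behaviors described above (navigation between the prompt and two responses, re-reading, reading-path length) can be illustrated with a minimal Python sketch. It assumes a trial is recorded as a time-ordered list of region labels; the labels `"prompt"`, `"A"`, `"B"` and the function name `summarize_reading_path` are hypothetical and do not reflect PreferRead's actual schema or the authors' definitions of these measures.

```python
from collections import Counter
from itertools import groupby


def summarize_reading_path(visits):
    """Summarize one trial's sequence of focused regions.

    `visits` is a hypothetical time-ordered list of region labels,
    e.g. ["prompt", "A", "B", "A"]; it is not PreferRead's real format.
    """
    # Collapse consecutive duplicates so that staying in a region is one visit.
    path = [region for region, _ in groupby(visits)]
    counts = Counter(path)
    # Assumption: a region counts as re-read if it is visited more than once.
    reread = {region for region, n in counts.items() if n > 1}
    return {
        "path": path,
        "path_length": len(path),
        "reread_regions": sorted(reread),
        "reread_response": bool(reread & {"A", "B"}),
    }


trial = ["prompt", "prompt", "A", "B", "B", "A"]
summary = summarize_reading_path(trial)
print(summary)
```

Under these assumptions, per-trial summaries like this one could then be compared against agreement scores, which is the kind of analysis the abstract reports.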

