Rater Cohesion and Quality from a Vicarious Perspective
This work aims to improve annotation quality in domains where rater disagreement is prevalent, such as AI safety and content moderation. Its contribution is incremental, since it builds on existing vicarious annotation and rater quality methods.
The paper tackles rater disagreement in human feedback for AI systems by combining vicarious annotation with rater cohesion metrics. It examines how political affiliation and demographic background relate to raters' perceptions of offense, and how these methods can help moderate the resulting disagreement. The abstract reports no numerical results.
Human feedback is essential for building human-centered AI systems across domains where disagreement is prevalent, such as AI safety, content moderation, or sentiment analysis. Many disagreements, particularly in politically charged settings, arise because raters have opposing values or beliefs. Vicarious annotation is a method for breaking down disagreement by asking raters how they think others would annotate the data. In this paper, we explore the use of vicarious annotation with analytical methods for moderating rater disagreement. We employ rater cohesion metrics to study the potential influence of political affiliations and demographic backgrounds on raters' perceptions of offense. Additionally, we utilize CrowdTruth's rater quality metrics, which consider the demographics of the raters, to score the raters and their annotations. We study how the rater quality metrics influence the in-group and cross-group rater cohesion across the personal and vicarious levels.
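To make the distinction between in-group and cross-group cohesion concrete, the following is a minimal sketch that assumes cohesion is measured as mean pairwise percent agreement between raters. This is an illustrative stand-in only: the abstract does not specify which cohesion metric is used, and the function names, group labels, and toy annotations below are all hypothetical. The same computation could be run once on personal-level labels and once on vicarious-level labels to compare the two.

```python
from itertools import combinations, product


def pairwise_agreement(a, b):
    """Fraction of items on which two raters gave the same label."""
    if len(a) != len(b) or not a:
        raise ValueError("raters must label the same non-empty set of items")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def in_group_cohesion(group):
    """Mean pairwise agreement over all rater pairs within one group."""
    pairs = list(combinations(group, 2))
    if not pairs:
        raise ValueError("need at least two raters in the group")
    return sum(pairwise_agreement(a, b) for a, b in pairs) / len(pairs)


def cross_group_cohesion(group_1, group_2):
    """Mean pairwise agreement over all rater pairs spanning two groups."""
    pairs = list(product(group_1, group_2))
    if not pairs:
        raise ValueError("both groups must contain at least one rater")
    return sum(pairwise_agreement(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy data: each inner list is one rater's labels on four items
# (1 = offensive, 0 = not offensive). Groups stand in for, e.g., two
# political affiliations; none of these values come from the paper.
group_a = [[1, 1, 0, 1], [1, 1, 0, 0]]
group_b = [[0, 0, 0, 1], [0, 1, 0, 1]]

in_a = in_group_cohesion(group_a)
in_b = in_group_cohesion(group_b)
cross = cross_group_cohesion(group_a, group_b)

print(f"in-group A: {in_a:.2f}, in-group B: {in_b:.2f}, cross-group: {cross:.2f}")
```

In this toy example each group agrees with itself more often than with the other group, which is the kind of pattern an in-group versus cross-group comparison is meant to surface. A chance-corrected agreement coefficient could be substituted for percent agreement without changing the overall structure.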