The Impact of Annotator Personas on LLM Behavior Across the Perspectivism Spectrum
This work addresses the challenge of modeling subjective annotations in NLP tasks such as hate speech detection. Its contribution is incremental, since it builds on existing perspectivism and annotator modeling techniques.
The study investigated how Large Language Models (LLMs) annotate hate speech and abusiveness when given predefined annotator personas across the strong-to-weak perspectivism spectrum. It found that LLMs use demographic attributes selectively, and that annotator modeling techniques without explicit annotator information performed better under weak perspectivism. Under strong perspectivism, LLM performance approached but did not exceed that of human annotators.
In this work, we explore the capability of Large Language Models (LLMs) to annotate hate speech and abusiveness while considering predefined annotator personas within the strong-to-weak data perspectivism spectrum. We evaluated LLM-generated annotations against existing annotator modeling techniques for perspective modeling. Our findings show that LLMs selectively use demographic attributes from the personas. We identified prototypical annotators whose persona features show varying degrees of alignment with the original human annotators. Within the data perspectivism paradigm, annotator modeling techniques that do not explicitly rely on annotator information performed better under weak data perspectivism compared with both strong data perspectivism and human annotations, suggesting that LLM-generated views tend towards aggregation despite subjective prompting. However, for more personalized datasets tailored to strong perspectivism, the performance of LLM annotator modeling approached, but did not exceed, that of human annotators.
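To make the strong-versus-weak distinction concrete, the following is a minimal sketch, not the authors' implementation. It assumes that weak perspectivism scores against a single aggregated (majority-vote) label, while strong perspectivism scores each prediction against that annotator's own label. The prompt wording, the persona fields, and the function names `build_persona_prompt`, `weak_target`, and `strong_accuracy` are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def build_persona_prompt(persona: dict, text: str) -> str:
    """Compose a hypothetical annotation prompt from persona attributes."""
    traits = ", ".join(f"{key}: {value}" for key, value in persona.items())
    return (
        f"You are an annotator with the following profile ({traits}). "
        "Is the following text hate speech? Answer 'yes' or 'no'.\n"
        f"Text: {text}"
    )


def weak_target(labels_by_annotator: dict) -> int:
    """Weak perspectivism (assumed): collapse all labels into one majority label."""
    counts = Counter(labels_by_annotator.values())
    top = max(counts.values())
    # Ties are broken in favour of the smaller label value.
    return min(label for label, count in counts.items() if count == top)


def strong_accuracy(predicted: dict, gold: dict) -> float:
    """Strong perspectivism (assumed): compare each prediction with that annotator's own label."""
    return sum(predicted[a] == gold[a] for a in gold) / len(gold)


# Toy labels for one text from three annotators (1 = hateful, 0 = not hateful).
gold_labels = {"a1": 1, "a2": 0, "a3": 1}
# Toy persona-conditioned predictions, one per annotator.
persona_predictions = {"a1": 1, "a2": 1, "a3": 1}

majority_label = weak_target(gold_labels)
per_annotator_score = strong_accuracy(persona_predictions, gold_labels)
example_prompt = build_persona_prompt({"age": 30, "gender": "female"}, "example text")
```

In this toy case every prediction matches the majority label, yet one of the three annotators is misrepresented under the per-annotator score. That gap illustrates how predictions that drift towards the aggregate can look accurate under the weak setting while falling short under the strong one.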