Personas with Attitudes: Controlling LLMs for Diverse Data Annotation
This work addresses the need for better data annotation in subjective NLP tasks such as toxicity detection, though the contribution is incremental because it builds on existing LLM prompting techniques.
The paper tackles limited diversity in data annotation by personalizing large language models (LLMs) with diverse persona descriptions, which yields more diverse annotations whose persona effects are controllable and repeatable.
We present a novel approach for enhancing diversity and control in data annotation tasks by personalizing large language models (LLMs). We investigate the impact of injecting diverse persona descriptions into LLM prompts across two studies, exploring whether personas increase annotation diversity and whether the impacts of individual personas on the resulting annotations are consistent and controllable. Our results show that persona-prompted LLMs produce more diverse annotations than LLMs prompted without personas and that these effects are both controllable and repeatable, making our approach a suitable tool for improving data annotation in subjective NLP tasks like toxicity detection.
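The persona-prompting idea described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the prompt wording, the helper names (`build_prompt`, `annotate`, `label_entropy`), the stand-in `fake_llm` that replaces a real model call, and the choice of Shannon entropy over labels as a simple diversity measure.

```python
from collections import Counter
from math import log2
from typing import Callable, List, Optional


def build_prompt(text: str, persona: Optional[str] = None) -> str:
    """Compose a toxicity-annotation prompt, optionally prefixed with a persona.

    The template wording is hypothetical, not the paper's actual prompt.
    """
    task = (
        "Is the following text toxic? Answer 'toxic' or 'not toxic'.\n"
        f"Text: {text}"
    )
    if persona is None:
        return task
    return f"You are the following person: {persona}\n{task}"


def annotate(
    text: str,
    personas: List[Optional[str]],
    llm: Callable[[str], str],
) -> List[str]:
    """Collect one label per persona; `llm` maps a prompt string to a label."""
    return [llm(build_prompt(text, persona)) for persona in personas]


def label_entropy(labels: List[str]) -> float:
    """Shannon entropy (in bits) of the label distribution.

    Used here as one possible diversity measure (an assumption):
    0 when all annotators agree, higher when labels are spread out.
    """
    total = len(labels)
    counts = Counter(labels)
    return sum(-(c / total) * log2(c / total) for c in counts.values())


def fake_llm(prompt: str) -> str:
    """Deterministic stand-in for a real LLM call, for demonstration only."""
    return "toxic" if "strict" in prompt.lower() else "not toxic"
```

To compare conditions, one could call `annotate` once with a list of persona descriptions and once with a list of `None` entries (the no-persona baseline), then compare `label_entropy` of the two label lists. Because `fake_llm` is deterministic, rerunning the same persona gives the same label, which mirrors the repeatability question in a trivial way; with a real model, repeatability would have to be measured empirically.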