Prompt Selection Matters: Enhancing Text Annotations for Social Sciences with Large Language Models
This work addresses a specific bottleneck in applying LLMs to social science text annotation, namely the sensitivity of labeling accuracy to prompt choice, and offers researchers a practical tool for improving that accuracy.
The study investigates how prompt selection affects labeling accuracy when Large Language Models annotate social science texts. It finds that performance varies greatly between prompts and that automatic prompt optimization systematically improves results.
Large Language Models have recently been applied to text annotation tasks from the social sciences, equalling or surpassing the performance of human workers at a fraction of the cost. However, no inquiry has yet been made into the impact of prompt selection on labelling accuracy. In this study, we show that performance varies greatly between prompts, and we apply the method of automatic prompt optimization to systematically craft high-quality prompts. We also provide the community with a simple, browser-based implementation of the method at https://prompt-ultra.github.io/.
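To make the claim that accuracy depends on the prompt concrete, the following is a minimal sketch of how candidate prompts could be compared on a small labelled sample. It is not the paper's optimization procedure, which the abstract does not describe; it only illustrates the scoring and selection step. All names (`accuracy`, `select_best_prompt`, `toy_annotate`) and the example texts are hypothetical, and `toy_annotate` is a keyword rule standing in for a real LLM call.

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical annotator interface: (prompt, text) -> predicted label.
# In practice this would wrap a call to an LLM; that call is assumed here.
Annotator = Callable[[str, str], str]


def accuracy(prompt: str, texts: Sequence[str], gold: Sequence[str],
             annotate: Annotator) -> float:
    """Share of texts whose predicted label matches the gold label."""
    predictions = [annotate(prompt, text) for text in texts]
    correct = sum(pred == label for pred, label in zip(predictions, gold))
    return correct / len(gold)


def select_best_prompt(candidates: Sequence[str], texts: Sequence[str],
                       gold: Sequence[str],
                       annotate: Annotator) -> Tuple[str, Dict[str, float]]:
    """Score every candidate prompt and return the most accurate one."""
    scores = {p: accuracy(p, texts, gold, annotate) for p in candidates}
    best = max(scores, key=scores.get)
    return best, scores


def toy_annotate(prompt: str, text: str) -> str:
    """Toy stand-in for an LLM (assumption): only a prompt that mentions
    'sentiment' triggers the keyword check; otherwise it always answers
    'negative'."""
    if "sentiment" in prompt.lower():
        return "positive" if "good" in text.lower() else "negative"
    return "negative"


# Illustrative data, invented for this sketch.
texts = ["Good policy", "Bad policy", "A good speech", "Terrible debate"]
gold = ["positive", "negative", "positive", "negative"]
candidates = [
    "Label the text.",
    "Classify the sentiment of the text as positive or negative.",
]

best, scores = select_best_prompt(candidates, texts, gold, toy_annotate)
print(best)
print(scores)
```

With the toy annotator, the vague prompt scores 0.5 and the specific one scores 1.0, so the second prompt is selected. An automatic optimization method would presumably go further by generating or refining candidates instead of relying on a fixed, hand-written list.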