CL · AI · Mar 9, 2025

Large Language Models Are Effective Human Annotation Assistants, But Not Good Independent Annotators

Feng Gu, Zongxia Li, Carlos Rafael Colon, Benjamin Evans, Ishani Mondal, Jordan Lee Boyd-Graber

arXiv:2503.06778v2 · 4 citations · h-index: 8

Originality: Incremental advance

AI Analysis

This work addresses the inefficiency and cost of human annotation for event analysis, but it is incremental as it builds on existing LLM-assisted approaches.

The study evaluates a holistic event annotation workflow and finds that LLMs are not reliable independent annotators compared to human experts. They can, however, assist experts by reducing the time and mental effort required, and LLM-assisted extraction leads to better agreement than fully automated methods.

Event annotation is important for identifying market changes, monitoring breaking news, and understanding sociological trends. Although expert annotators set the gold standards, human coding is expensive and inefficient. Unlike information extraction experiments that focus on single contexts, we evaluate a holistic workflow that removes irrelevant documents, merges documents about the same event, and annotates the events. Although LLM-based automated annotations are better than traditional TF-IDF-based methods or Event Set Curation, they are still not reliable annotators compared to human experts. However, adding LLMs to assist experts for Event Set Curation can reduce the time and mental effort required for Variable Annotation. When LLMs extract event variables to assist expert annotators, the experts agree more with the extracted variables than with fully automated LLM annotations.
