CL | Apr 24, 2024

Annotator-Centric Active Learning for Subjective NLP Tasks

Michiel van der Meer, Neele Falk, Pradeep K. Murukannaiah, Enrico Liscio

arXiv:2404.15720v417.530 citations | h-index: 14 | Has Code | EMNLP

Originality: Incremental advance

AI Analysis

The paper addresses two problems facing NLP researchers: the high cost of collecting annotations and the variability of human perspectives in subjective tasks. The contribution is incremental, since it builds on existing active learning methods.

The paper tackles the challenge of capturing diverse human judgments in subjective NLP tasks by introducing Annotator-Centric Active Learning (ACAL), which improves data efficiency and performs well in annotator-centric evaluations across seven tasks.

Active Learning (AL) addresses the high costs of collecting human annotations by strategically annotating the most informative samples. However, for subjective NLP tasks, incorporating a wide range of perspectives in the annotation process is crucial to capture the variability in human judgments. We introduce Annotator-Centric Active Learning (ACAL), which incorporates an annotator selection strategy following data sampling. Our objective is two-fold: 1) to efficiently approximate the full diversity of human judgments, and 2) to assess model performance using annotator-centric metrics, which value minority and majority perspectives equally. We experiment with multiple annotator selection strategies across seven subjective NLP tasks, employing both traditional and novel, human-centered evaluation metrics. Our findings indicate that ACAL improves data efficiency and excels in annotator-centric performance evaluations. However, its success depends on the availability of a sufficiently large and diverse pool of annotators to sample from.
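The two-step idea in the abstract, data sampling followed by annotator selection, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes entropy-based uncertainty sampling for the data step and a hypothetical "least-queried annotator" rule for the annotator step; the abstract names neither. All function and variable names (`select_item`, `select_annotator`, `acal_round`, `query_counts`, `annotators_for`) are illustrative.

```python
import math
import random
from collections import Counter


def entropy(probs):
    """Shannon entropy (natural log) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_item(unlabeled, predict_proba):
    """Data sampling step: pick the item the model is most uncertain about.

    Assumption: uncertainty is measured by the entropy of the predicted
    label distribution; any other AL sampling strategy could be swapped in.
    """
    return max(unlabeled, key=lambda item: entropy(predict_proba(item)))


def select_annotator(candidates, query_counts, rng):
    """Annotator selection step (hypothetical strategy, not from the paper).

    Prefer the annotator queried least often so far, breaking ties at
    random, so that no single perspective dominates the collected labels.
    """
    fewest = min(query_counts[a] for a in candidates)
    return rng.choice([a for a in candidates if query_counts[a] == fewest])


def acal_round(unlabeled, annotators_for, predict_proba, query_counts, rng):
    """One round: choose an item, then choose who should annotate it."""
    item = select_item(unlabeled, predict_proba)
    annotator = select_annotator(annotators_for[item], query_counts, rng)
    query_counts[annotator] += 1
    return item, annotator


# Toy usage: two items, two annotators, fixed model predictions.
toy_probs = {"a": [0.9, 0.1], "b": [0.5, 0.5]}
toy_annotators = {"a": ["ann1"], "b": ["ann1", "ann2"]}
toy_counts = Counter({"ann1": 3})  # ann1 has already been queried 3 times
toy_item, toy_annotator = acal_round(
    ["a", "b"], toy_annotators, toy_probs.get, toy_counts, random.Random(0)
)
```

In the toy run, item "b" has the higher predictive entropy and is chosen first; "ann2" is then selected because that annotator has not been queried yet. The sketch also shows why the approach needs a sufficiently large annotator pool: when only one annotator is available for an item, the selection step has nothing to choose between.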
