CL Jun 20, 2024

Let Guidelines Guide You: A Prescriptive Guideline-Centered Data Annotation Methodology

Federico Ruggeri, Eleonora Misino, Arianna Muti, Katerina Korre, Paolo Torroni, Alberto Barrón-Cedeño

arXiv:2406.14099v33.43 citations Has Code

Originality: Incremental advance

AI Analysis

This work addresses annotation quality issues for machine learning researchers and practitioners, but the advance is incremental because it builds on existing prescriptive annotation methods.

The paper tackles the problem of information loss and guideline adherence in data annotation by introducing GCAM, a methodology that reports guidelines per sample, and shows it improves annotation quality and enables efficient data reuse across tasks.

We introduce the Guideline-Centered Annotation Methodology (GCAM), a novel data annotation methodology designed to report the annotation guidelines associated with each data sample. Our approach addresses three key limitations of the standard prescriptive annotation methodology by reducing the information loss during annotation and ensuring adherence to guidelines. Furthermore, GCAM enables the efficient reuse of annotated data across multiple tasks. We evaluate GCAM in two ways: (i) through a human annotation study and (ii) through an experimental evaluation with several machine learning models. Our results highlight the advantages of GCAM from multiple perspectives, demonstrating its potential to improve annotation quality and error analysis.
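
To make the idea of reporting guidelines per sample concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each sample stores the identifiers of the guidelines an annotator judged applicable, and that task labels are derived afterwards through a separate guideline-to-label mapping. The abstract does not spell out that mapping step, and the guideline IDs, label names, and the `derive_labels` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotatedSample:
    """A data sample stored with the guidelines chosen by the annotator."""
    text: str
    guideline_ids: frozenset  # hypothetical guideline identifiers, e.g. "G1"


# Hypothetical guideline-to-label mappings for two different tasks.
# Assumption: labels are derived from guidelines after annotation.
TASK_A = {"G1": "offensive", "G2": "offensive"}
TASK_B = {"G2": "targets_group"}


def derive_labels(sample, guideline_to_label):
    """Return the labels implied by the sample's guidelines for one task."""
    return {
        guideline_to_label[g]
        for g in sample.guideline_ids
        if g in guideline_to_label
    }


samples = [
    AnnotatedSample("example text 1", frozenset({"G1"})),
    AnnotatedSample("example text 2", frozenset({"G1", "G2"})),
    AnnotatedSample("example text 3", frozenset()),
]

# The same guideline annotations are reused for both tasks.
for s in samples:
    print(s.text, derive_labels(s, TASK_A), derive_labels(s, TASK_B))
```

Under these assumptions, reusing the data for a new task only requires a new mapping rather than a new annotation round, and the stored guideline IDs remain available for error analysis.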
