Evaluating the Impact of LLM-Assisted Annotation in a Perspectivized Setting: the Case of FrameNet Annotation
This work addresses the need for comprehensive evaluation of LLM tools used to create annotated datasets for linguistic research, particularly under a perspectivized approach to NLP. The contribution is incremental, since it builds on existing methods.
The paper evaluates LLM-assisted annotation for FrameNet-like semantic role labeling. It finds that a semi-automatic, hybrid approach increases frame diversity and keeps coverage similar to manual annotation, while fully automatic annotation performs worse on every metric except annotation time.
The use of LLM-based applications to accelerate and/or replace human labor in the creation of language resources and datasets is a reality. Nonetheless, despite the potential of such tools for linguistic research, a comprehensive evaluation of their performance and of their impact on the creation of annotated datasets, especially under a perspectivized approach to NLP, is still missing. This paper helps narrow this gap by reporting on an extensive evaluation of the (semi-)automation of FrameNet-like semantic annotation using an LLM-based semantic role labeler. The methodology compares annotation time, coverage, and diversity in three experimental settings: manual, automatic, and semi-automatic annotation. Results show that the hybrid, semi-automatic setting leads to increased frame diversity and similar annotation coverage compared to the human-only setting, while the automatic setting performs considerably worse on all metrics except annotation time.
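
To make the three-way comparison concrete, the following is a minimal sketch of how annotation time, coverage, and diversity could be tabulated per setting. The metric definitions are assumptions for illustration only, since the abstract does not specify them: coverage is taken here as the share of annotatable units that received a frame label, and diversity as the number of distinct frames together with the Shannon entropy of the frame distribution. The function names and the toy data are hypothetical and are not results from the paper.

```python
from collections import Counter
from math import log2


def coverage(annotated_units: int, total_units: int) -> float:
    """Assumed definition: share of annotatable units that got a frame label."""
    return annotated_units / total_units if total_units else 0.0


def frame_diversity(frames: list[str]) -> tuple[int, float]:
    """Assumed definition: (distinct frame count, Shannon entropy in bits)."""
    counts = Counter(frames)
    total = sum(counts.values())
    if total == 0:
        return 0, 0.0
    entropy = -sum((c / total) * log2(c / total) for c in counts.values())
    return len(counts), entropy


def summarize(settings: dict[str, dict]) -> dict[str, dict]:
    """Compute the three metrics for each experimental setting."""
    summary = {}
    for name, data in settings.items():
        distinct, entropy = frame_diversity(data["frames"])
        summary[name] = {
            "seconds": data["seconds"],
            "coverage": coverage(data["annotated"], data["total"]),
            "distinct_frames": distinct,
            "entropy_bits": entropy,
        }
    return summary


if __name__ == "__main__":
    # Toy, made-up values used only to exercise the functions.
    toy_settings = {
        "manual": {
            "frames": ["Motion", "Motion", "Statement"],
            "annotated": 3, "total": 5, "seconds": 120.0,
        },
        "automatic": {
            "frames": ["Motion", "Motion"],
            "annotated": 2, "total": 5, "seconds": 4.0,
        },
        "semi-automatic": {
            "frames": ["Motion", "Statement", "Commerce_buy"],
            "annotated": 3, "total": 5, "seconds": 60.0,
        },
    }
    for setting, metrics in summarize(toy_settings).items():
        print(setting, metrics)
```

Reporting both a distinct-frame count and an entropy value is one possible design choice: the count shows how many frames were used at all, while entropy also reflects how evenly annotations are spread across them.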