The DALPHI annotation framework & how its pre-annotations can improve annotator efficiency
This work addresses annotation efficiency for ML/NLP practitioners, though it appears incremental because it builds on existing concepts of active-learning-based annotation assistance.
The paper tackles the problem of repetitive human annotation work for ML/NLP training data by introducing the DALPHI framework, which provides automated pre-annotations generated by an active-learning-based assistance system. In a study with 66 participants on named entity annotation, it shows improvements in both the quality and the quantity of annotations, even when the pre-annotations reach a recall of only 50%.
Producing the required amounts of training data for machine learning and NLP tasks often involves human annotators doing very repetitive and monotonous work. In this paper, we present and evaluate our novel annotation framework DALPHI, which facilitates the annotation process by providing the annotator with suggestions generated by an automated, active-learning-based assistance system. In a study with 66 participants, we demonstrate on the example task of annotating named entities in text documents that this assistance system can improve the annotation process with respect to both the quality and the quantity of produced annotations, even if the pre-annotations provided by the assistance system are at a recall level of only 50%.
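
To make the notion of a 50% recall level concrete, the following minimal Python sketch computes the recall of a set of pre-annotations against gold-standard entity annotations. It is an illustration only and not part of DALPHI: the `Span` type, the `pre_annotation_recall` function, the example offsets, and the exact-match criterion (same start, end, and label) are all assumptions, since the abstract does not state how recall was measured.

```python
from typing import NamedTuple, Set


class Span(NamedTuple):
    """Hypothetical entity annotation; not a DALPHI data structure."""
    start: int  # character offset where the entity begins
    end: int    # character offset just past the entity
    label: str  # entity type, e.g. "PER" or "LOC"


def pre_annotation_recall(gold: Set[Span], suggested: Set[Span]) -> float:
    """Fraction of gold entities that also appear among the suggestions.

    Assumption: a suggestion counts only on an exact match of
    start, end, and label.
    """
    if not gold:
        return 1.0  # assumption: with nothing to find, recall is treated as full
    return len(gold & suggested) / len(gold)


# Four gold entities; the suggestions recover two of them and add one spurious span.
gold = {
    Span(0, 5, "PER"),
    Span(10, 16, "LOC"),
    Span(20, 27, "ORG"),
    Span(30, 34, "PER"),
}
suggested = {
    Span(0, 5, "PER"),
    Span(20, 27, "ORG"),
    Span(40, 44, "LOC"),  # spurious: lowers precision but not recall
}

recall = pre_annotation_recall(gold, suggested)
print(f"recall = {recall:.2f}")
```

In this example the assistance system finds two of the four gold entities, so the annotator would still have to add the remaining half by hand. The spurious suggestion does not affect recall; it would only show up in precision.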