ALANNO: An Active Learning Annotation System for Mortals
ALANNO gives NLP researchers and practitioners a practical tool for reducing labeling costs, though the contribution is incremental because it builds on existing active learning methods.
The paper addresses the cost and tedium of data labeling in supervised machine learning. It presents ALANNO, an open-source active learning (AL) annotation system for NLP tasks that manages multi-annotator setups and supports configurable AL methods and models.
Supervised machine learning has become the cornerstone of today's data-driven society, increasing the need for labeled data. However, the process of acquiring labels is often expensive and tedious. One possible remedy is to use active learning (AL) -- a special family of machine learning algorithms designed to reduce labeling costs. Although AL has been successful in practice, a number of practical challenges hinder its effectiveness and are often overlooked in existing AL annotation tools. To address these challenges, we developed ALANNO, an open-source annotation system for NLP tasks equipped with features to make AL effective in real-world annotation projects. ALANNO facilitates annotation management in a multi-annotator setup and supports a variety of AL methods and underlying models, which are easily configurable and extensible.
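To make the idea of AL concrete, the following minimal sketch shows one common query strategy, entropy-based uncertainty sampling: the model's predicted class probabilities for unlabeled items are ranked, and the items the model is least sure about are sent to annotators first. This is an illustration only and not ALANNO's API. The function names and the probability values are hypothetical, and the abstract does not state which AL methods ALANNO implements.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(pool_probs, batch_size):
    """Return indices of the `batch_size` pool items with the highest entropy.

    `pool_probs` holds one probability distribution per unlabeled item.
    """
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:batch_size]


# Hypothetical model predictions for four unlabeled documents (binary task).
pool_probs = [
    [0.98, 0.02],  # model is confident
    [0.55, 0.45],  # model is unsure
    [0.80, 0.20],
    [0.50, 0.50],  # model is maximally unsure
]

# Items to hand to annotators in the next round.
selected = select_most_uncertain(pool_probs, batch_size=2)
print(selected)
```

In a full AL loop, the selected items would be labeled, added to the training set, and the model retrained before the next selection round.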