CL · Oct 9, 2020

iobes: A Library for Span-Level Processing

arXiv:2010.04373v1 · 0.31 citations · Has Code

Originality: Synthesis-oriented

AI Analysis

The paper addresses a practical problem for NLP researchers and practitioners by providing a tool that streamlines span processing, though the contribution is incremental because it builds on existing annotation schemes.

The paper tackles the lack of a standardized library for processing span-level annotations in NLP tasks like named entity recognition, and introduces iobes, an open-source library for parsing, converting, and handling token-level labels to improve fairness and comparability in metrics.

Many tasks in natural language processing, such as named entity recognition and slot-filling, involve identifying and labeling specific spans of text. In order to leverage common models, these tasks are often recast as sequence labeling tasks. Each token is given a label, and these labels are prefixed with special tokens such as B- or I-. After a model assigns labels to each token, these prefixes are used to group the tokens into spans. Properly parsing these annotations is critical for producing fair and comparable metrics; however, despite its importance, there is no easy-to-use, standardized, programmatically integrable library to help work with span labeling. To remedy this, we introduce our open-source library, iobes. iobes is used for parsing, converting, and processing spans represented as token-level decisions.
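
To make the grouping step concrete, here is a minimal sketch of how B-/I- prefixed token labels can be collected into spans. It does not use the iobes library or reproduce its API; the names `Span` and `parse_bio_spans` are hypothetical and exist only for this illustration. It also assumes one particular handling of ill-formed sequences: an I- tag with no matching open span starts a new span instead of being discarded. Other parsers may choose differently, which is one reason metrics can diverge between implementations.

```python
from typing import List, NamedTuple


class Span(NamedTuple):
    """Hypothetical container for one labeled span (not the iobes type)."""
    label: str  # entity type, e.g. "PER"
    start: int  # index of the first token in the span
    end: int    # index one past the last token (exclusive)


def parse_bio_spans(tags: List[str]) -> List[Span]:
    """Group B-/I- prefixed token tags into spans.

    Assumption: an I- tag whose type does not match the currently open
    span (or that follows an O) opens a new span rather than being dropped.
    """
    spans: List[Span] = []
    label = None  # type of the span currently being built, if any
    start = 0     # token index where that span began
    for i, tag in enumerate(tags):
        # "B-PER" -> ("B", "PER"); "O" -> ("O", "")
        prefix, _, kind = tag.partition("-")
        # An I- tag of the same type simply extends the open span.
        if prefix == "I" and kind == label:
            continue
        # Any other tag ends the open span, if there is one.
        if label is not None:
            spans.append(Span(label, start, i))
            label = None
        # B- always opens a span; a stray I- opens one too (see assumption).
        if prefix in ("B", "I") and kind:
            label, start = kind, i
    # Close a span that runs to the end of the sequence.
    if label is not None:
        spans.append(Span(label, start, len(tags)))
    return spans
```

For example, `parse_bio_spans(["B-PER", "I-PER", "O", "B-LOC", "O", "I-ORG", "I-ORG"])` returns three spans: `PER` over tokens 0 to 2, `LOC` over tokens 3 to 4, and `ORG` over tokens 5 to 7 (end indices exclusive). The last span comes from the stray I-ORG tags, which a stricter parser might reject.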
