CL · Oct 9, 2020

iobes: A Library for Span-Level Processing

arXiv:2010.04373v1 · 1 citation · Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses a practical problem for NLP researchers and practitioners by providing a tool that streamlines span processing. The contribution is incremental, since it builds on existing annotation schemes.

The paper tackles the lack of a standardized library for processing span-level annotations in NLP tasks like named entity recognition, and introduces iobes, an open-source library for parsing, converting, and handling token-level labels to improve fairness and comparability in metrics.

Many tasks in natural language processing, such as named entity recognition and slot-filling, involve identifying and labeling specific spans of text. In order to leverage common models, these tasks are often recast as sequence labeling tasks. Each token is given a label and these labels are prefixed with special tokens such as B- or I-. After a model assigns a label to each token, these prefixes are used to group the tokens into spans. Properly parsing these annotations is critical for producing fair and comparable metrics; however, despite its importance, there is not an easy-to-use, standardized, programmatically integratable library to help work with span labeling. To remedy this, we introduce our open-source library, iobes. iobes is used for parsing, converting, and processing spans represented as token-level decisions.
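
To make the prefix-based grouping described in the abstract concrete, here is a minimal sketch of turning B-/I-/O token labels into spans. It is an illustration only and does not use or reproduce the iobes API: the names `Span` and `parse_bio_spans` are hypothetical, and the lenient handling of a stray I- tag (treating it as the start of a span) is an assumption about one common convention, not something the abstract specifies.

```python
from typing import List, NamedTuple, Optional


class Span(NamedTuple):
    """Hypothetical span record; not a type from the iobes library."""
    label: str  # entity type, e.g. "PER"
    start: int  # index of the first token in the span
    end: int    # index one past the last token in the span


def parse_bio_spans(tags: List[str]) -> List[Span]:
    """Group BIO token-level tags into (label, start, end) spans."""
    spans: List[Span] = []
    label: Optional[str] = None  # type of the span currently open, if any
    start = 0
    for i, tag in enumerate(tags):
        # "B-PER" -> ("B", "PER"); "O" -> ("O", "")
        prefix, _, entity = tag.partition("-")
        # An I- tag continues the open span only when the types match.
        if prefix == "I" and entity == label:
            continue
        # Any other tag closes the span that is currently open.
        if label is not None:
            spans.append(Span(label, start, i))
            label = None
        # B- opens a new span. Assumption: a stray I- also opens one.
        if prefix in ("B", "I") and entity:
            label, start = entity, i
    # Close a span that runs to the end of the sequence.
    if label is not None:
        spans.append(Span(label, start, len(tags)))
    return spans


tags = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "I-LOC", "O"]
spans = parse_bio_spans(tags)
# spans holds a PER span over tokens 0-1 and a LOC span over tokens 3-5
```

How malformed sequences are resolved (for example an I- tag with no preceding B-) changes which spans are counted, which is one reason span-level metrics can differ between implementations.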

Code Implementations: 1 repo