CL | Mar 18, 2024

Embedded Named Entity Recognition using Probing Classifiers

arXiv:2403.11747v213.825 citations | h-index: 5 | Has Code | EMNLP

Originality: Incremental advance

AI Analysis

This work addresses the need for efficient semantic extraction in applications such as chat assistants and automated fact checking. Its contribution is an incremental advance that improves computational efficiency.

The paper tackles the problem of extracting named entities during streaming text generation without fine-tuning the language model or running a separate model. It proposes EMBER, which maintains high token generation rates with a slowdown of only around 1%, compared to a 43.64% slowdown measured for a baseline.

Streaming text generation has become a common way of increasing the responsiveness of language model powered applications, such as chat assistants. At the same time, extracting semantic information from generated text is a useful tool for applications such as automated fact checking or retrieval augmented generation. Currently, this requires either separate models during inference, which increases computational cost, or destructive fine-tuning of the language model. Instead, we propose an approach called EMBER which enables streaming named entity recognition in decoder-only language models without fine-tuning them and while incurring minimal additional computational cost at inference time. Specifically, our experiments show that EMBER maintains high token generation rates, with only a negligible decrease in speed of around 1% compared to a 43.64% slowdown measured for a baseline. We make our code and data available online, including a toolkit for training, testing, and deploying efficient token classification models optimized for streaming text generation.
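The general idea of a probing classifier on a frozen model can be illustrated with a toy sketch. This is not the authors' implementation: the generator `toy_generate`, the two-dimensional hidden states, the label set, and the hand-picked probe weights are all hypothetical stand-ins, chosen only to show a small linear classifier labeling each token as it is streamed while the language model itself is left unchanged.

```python
from typing import Iterator, List, Tuple

# Hypothetical label set; a real NER probe would use a richer tag scheme.
LABELS = ["O", "ENTITY"]

# Hypothetical probe parameters (one weight row per label). In practice these
# would be learned from hidden states of the frozen model, not set by hand.
WEIGHTS = [[-1.0, 1.0], [1.0, -1.0]]
BIAS = [0.0, 0.0]


def toy_generate() -> Iterator[Tuple[str, List[float]]]:
    """Stand-in for a frozen decoder-only LM: yields (token, hidden_state)."""
    stream = [
        ("Marie", [0.9, 0.1]),
        ("Curie", [0.8, 0.2]),
        ("won", [0.1, 0.7]),
        ("prizes", [0.2, 0.9]),
    ]
    for token, hidden in stream:
        yield token, hidden


def probe(hidden: List[float]) -> str:
    """Linear probe: score each label from the hidden state, return the argmax."""
    scores = [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(WEIGHTS, BIAS)
    ]
    best = max(range(len(scores)), key=lambda i: scores[i])
    return LABELS[best]


def stream_with_labels() -> Iterator[Tuple[str, str]]:
    """Label every token as soon as it is generated; the LM is never modified."""
    for token, hidden in toy_generate():
        yield token, probe(hidden)


tagged = list(stream_with_labels())
for token, label in tagged:
    print(token, label)
```

Because the probe only reads hidden states that generation already produces, the extra work per token in this sketch is a single small matrix-vector product, which is the intuition behind a low overhead at inference time.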
