CL, Mar 18, 2024

Embedded Named Entity Recognition using Probing Classifiers

arXiv:2403.11747v2, 25 citations, h-index: 5, Has Code, EMNLP
Originality: Incremental advance
AI Analysis

This paper addresses the need for efficient semantic extraction in applications such as chat assistants and fact-checking. Its contribution is an incremental advance: a method that lowers the computational cost of extracting entities while text is being generated.

The paper tackles the problem of extracting named entities from streaming text generation without fine-tuning the language model or running a separate model. It proposes EMBER, which maintains high token generation rates with a slowdown of only around 1%, compared to a 43.64% slowdown measured for a baseline.

Streaming text generation has become a common way of increasing the responsiveness of language model powered applications, such as chat assistants. At the same time, extracting semantic information from generated text is a useful tool for applications such as automated fact checking or retrieval augmented generation. Currently, this requires either separate models during inference, which increases computational cost, or destructive fine-tuning of the language model. Instead, we propose an approach called EMBER which enables streaming named entity recognition in decoder-only language models without fine-tuning them and while incurring minimal additional computational cost at inference time. Specifically, our experiments show that EMBER maintains high token generation rates, with only a negligible decrease in speed of around 1% compared to a 43.64% slowdown measured for a baseline. We make our code and data available online, including a toolkit for training, testing, and deploying efficient token classification models optimized for streaming text generation.
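
The general idea of attaching a probing classifier to a frozen language model during streaming generation can be sketched as follows. This is a minimal illustration, not EMBER's actual implementation: the generator `fake_generate`, the two-label set, the two-dimensional hidden states, and the hand-set probe weights are all hypothetical stand-ins. In practice the hidden states would come from the language model's own forward pass and the probe weights would be learned from labeled data while the model's weights stay untouched.

```python
from typing import Iterator, List, Sequence, Tuple

LABELS = ["O", "ENTITY"]  # hypothetical two-class label set


def fake_generate() -> Iterator[Tuple[str, List[float]]]:
    """Stand-in for a frozen decoder-only LM.

    Yields one (token, hidden_state) pair per generation step. The
    hidden states here are made-up 2-d vectors, not real activations.
    """
    yield from [
        ("Marie", [0.9, 0.1]),
        ("Curie", [0.8, 0.2]),
        ("won", [0.1, 0.7]),
        ("prizes", [0.2, 0.9]),
    ]


class LinearProbe:
    """A linear classifier that reads hidden states; the LM is never updated."""

    def __init__(self, weights: Sequence[Sequence[float]], bias: Sequence[float]):
        self.weights = weights  # one weight row per label
        self.bias = bias        # one bias term per label

    def predict(self, hidden: Sequence[float]) -> str:
        # Score each label with a dot product plus bias, then take the argmax.
        scores = [
            sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(self.weights, self.bias)
        ]
        return LABELS[scores.index(max(scores))]


def stream_with_entities(
    steps: Iterator[Tuple[str, List[float]]], probe: LinearProbe
) -> Iterator[Tuple[str, str]]:
    """Tag each token as it is generated, reusing the already-computed hidden state."""
    for token, hidden in steps:
        yield token, probe.predict(hidden)


# Hand-set weights for illustration only; a real probe would be trained.
probe = LinearProbe(weights=[[-1.0, 1.0], [1.0, -1.0]], bias=[0.0, 0.0])
tagged = list(stream_with_entities(fake_generate(), probe))
```

Because the probe only reads a hidden state that the model has already computed for the current step, the extra work per token in this sketch is a single small matrix-vector product, which is consistent with the low overhead the abstract reports.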

Code Implementations: 2 repos