CL | AI | Jul 5, 2021

Weakly Supervised Named Entity Tagging with Learnable Logical Rules

arXiv:2107.02282v1716 citations
Originality: Highly original
AI Analysis

This work addresses the challenge of rapidly building entity taggers in emerging domains with minimal supervision, representing an incremental improvement over previous methods by automating rule generation and boundary detection.

The paper tackles the problem of building named entity tagging systems using only a few rules as weak supervision. It proposes TALLOR, a method that automatically bootstraps high-quality logical rules to train a neural tagger. Starting from only 20 simple rules, TALLOR outperforms other weakly supervised methods and rivals a state-of-the-art distantly supervised tagger that relies on a lexicon of over 2,000 terms.

We study the problem of building entity tagging systems by using a few rules as weak supervision. Previous methods mostly focus on disambiguating entity types based on contexts and expert-provided rules, while assuming entity spans are given. In this work, we propose a novel method, TALLOR, that bootstraps high-quality logical rules to train a neural tagger in a fully automated manner. Specifically, we introduce compound rules that are composed from simple rules to increase the precision of boundary detection and generate more diverse pseudo labels. We further design a dynamic label selection strategy to ensure pseudo label quality and therefore avoid overfitting the neural tagger. Experiments on three datasets demonstrate that our method outperforms other weakly supervised methods and even rivals a state-of-the-art distantly supervised tagger with a lexicon of over 2,000 terms when starting from only 20 simple rules. Our method can serve as a tool for rapidly building taggers in emerging domains and tasks. Case studies show that learned rules can potentially explain the predicted entities.
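
To make the idea of compound rules concrete, here is a minimal sketch of how a conjunction of two simple rules can tighten span boundaries. It is an illustration only, not the TALLOR implementation: the rule types (`ends_with`, `preceded_by`), the helper names, the example sentence, and the choice of plain logical AND are all assumptions made for this example.

```python
from typing import Callable, List, Tuple

Tokens = List[str]
Span = Tuple[int, int]  # token indices, end-exclusive: [start, end)
Rule = Callable[[Tokens, Span], bool]  # True if the rule fires on the span


def ends_with(word: str) -> Rule:
    """Hypothetical simple rule: the last token of the span equals `word`."""
    return lambda toks, span: toks[span[1] - 1].lower() == word.lower()


def preceded_by(word: str) -> Rule:
    """Hypothetical simple rule: the token just before the span equals `word`."""
    return lambda toks, span: span[0] > 0 and toks[span[0] - 1].lower() == word.lower()


def all_of(*rules: Rule) -> Rule:
    """Compound rule (assumed here to be a logical AND of simple rules)."""
    return lambda toks, span: all(rule(toks, span) for rule in rules)


def candidate_spans(toks: Tokens, max_len: int = 3) -> List[Span]:
    """Enumerate every span of up to `max_len` tokens."""
    return [
        (i, j)
        for i in range(len(toks))
        for j in range(i + 1, min(i + max_len, len(toks)) + 1)
    ]


def matches(rule: Rule, toks: Tokens) -> List[Span]:
    """Return all candidate spans on which the rule fires."""
    return [span for span in candidate_spans(toks) if rule(toks, span)]


toks = "treatment of Parkinson disease in adults".split()

# The simple rule alone fires on several overlapping spans,
# so it cannot decide where the entity starts.
simple_hits = matches(ends_with("disease"), toks)

# Adding a left-context condition leaves a single span: "Parkinson disease".
compound_hits = matches(all_of(ends_with("disease"), preceded_by("of")), toks)

print(simple_hits)    # [(1, 4), (2, 4), (3, 4)]
print(compound_hits)  # [(2, 4)]
```

In this toy setting the simple rule matches three overlapping spans, while the compound rule keeps only the one whose left boundary is supported by context, which is the intuition behind using compound rules for more precise boundary detection.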

Code Implementations: 2 repos
