CL · Apr 29, 2020

Empower Entity Set Expansion via Language Model Probing

Yunyi Zhang, Jiaming Shen, Jingbo Shang, Jiawei Han

arXiv:2004.13897v231.31006 citations · Has Code

Originality: Incremental advance

AI Analysis

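This work addresses a key challenge in entity set expansion for NLP and IR applications: ambiguous context features shift the class semantics, so errors accumulate over later iterations. Its use of automatically generated class names to counter this is novel, though the overall advance is incremental.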

The paper tackles the problem of semantic drift in entity set expansion by proposing an iterative framework that uses automatically generated class names from a pre-trained language model to score candidate entities, achieving significant performance improvements over previous state-of-the-art methods on two datasets.

Entity set expansion, aiming at expanding a small seed entity set with new entities belonging to the same semantic class, is a critical task that benefits many downstream NLP and IR applications, such as question answering, query understanding, and taxonomy construction. Existing set expansion methods bootstrap the seed entity set by adaptively selecting context features and extracting new entities. A key challenge for entity set expansion is to avoid selecting ambiguous context features which will shift the class semantics and lead to accumulative errors in later iterations. In this study, we propose a novel iterative set expansion framework that leverages automatically generated class names to address the semantic drift issue. In each iteration, we select one positive and several negative class names by probing a pre-trained language model, and further score each candidate entity based on selected class names. Experiments on two datasets show that our framework generates high-quality class names and outperforms previous state-of-the-art methods significantly.
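
The iterative loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the pre-trained language model is replaced by a hand-written lookup table (`TOY_AFFINITY`), and the names `probe` and `expand`, the candidate class names, and the scoring rule (affinity to the positive class name minus the strongest affinity to any negative class name) are all assumptions made for illustration. Only the overall shape follows the abstract: in each iteration, pick one positive and several negative class names, then score candidate entities against them.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for probing a pre-trained language model.
# TOY_AFFINITY[class_name][entity] is a made-up score in [0, 1];
# a real system would obtain such scores by querying the model.
TOY_AFFINITY: Dict[str, Dict[str, float]] = {
    "countries": {"France": 0.9, "Japan": 0.9, "Canada": 0.8,
                  "Brazil": 0.6, "Texas": 0.3, "Paris": 0.2},
    "states":    {"France": 0.2, "Japan": 0.2, "Canada": 0.3,
                  "Brazil": 0.2, "Texas": 0.9, "Paris": 0.1},
    "cities":    {"France": 0.1, "Japan": 0.1, "Canada": 0.1,
                  "Brazil": 0.1, "Texas": 0.1, "Paris": 0.9},
}


def probe(class_name: str, entity: str) -> float:
    """Toy probe: look up how well an entity fits a class name."""
    return TOY_AFFINITY[class_name].get(entity, 0.0)


def expand(seeds: List[str],
           candidates: List[str],
           class_names: List[str],
           probe_fn: Callable[[str, str], float],
           rounds: int = 2,
           per_round: int = 1) -> List[str]:
    """Grow the seed set for a fixed number of rounds."""
    current = list(seeds)
    for _ in range(rounds):
        # Rank class names by mean affinity to the current entity set.
        mean = {c: sum(probe_fn(c, e) for e in current) / len(current)
                for c in class_names}
        ranked = sorted(class_names, key=lambda c: mean[c], reverse=True)
        positive, negatives = ranked[0], ranked[1:]

        # Illustrative scoring rule (an assumption, not the paper's formula):
        # reward fit to the positive name, penalize fit to any negative name.
        def score(entity: str) -> float:
            worst = max((probe_fn(n, entity) for n in negatives), default=0.0)
            return probe_fn(positive, entity) - worst

        pool = [e for e in candidates if e not in current]
        pool.sort(key=score, reverse=True)
        current.extend(pool[:per_round])
    return current


if __name__ == "__main__":
    result = expand(["France", "Japan"],
                    ["Texas", "Canada", "Paris", "Brazil"],
                    ["countries", "states", "cities"],
                    probe)
    print(result)
```

With this toy table, the seed set grows by "Canada" and then "Brazil", while "Texas" and "Paris" are held back because they fit a negative class name better than the positive one. That is the intuition behind using negative class names to limit semantic drift.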
