CL · LG · May 27, 2021

Measuring Fine-Grained Domain Relevance of Terms: A Hierarchical Core-Fringe Approach

Jie Huang, Kevin Chen-Chuan Chang, Jinjun Xiong, Wen-mei Hwu

arXiv:2105.13255 · h-index: 64 · Has Code

Originality: Incremental advance

AI Analysis

The paper addresses the need for accurate term-relevance measurement in NLP and applies broadly across domains, though its improvement over existing methods is incremental.

The paper tackles the problem of measuring the fine-grained domain relevance of terms, which matters for many downstream NLP tasks. It proposes a hierarchical core-fringe approach that, in the reported experiments, outperforms strong baselines and even surpasses professional human performance.

We propose to measure fine-grained domain relevance - the degree to which a term is relevant to a broad (e.g., computer science) or narrow (e.g., deep learning) domain. Such measurement is crucial for many downstream tasks in natural language processing. To handle long-tail terms, we build a core-anchored semantic graph, which uses core terms with rich description information to bridge the vast remaining fringe terms semantically. To support a fine-grained domain without relying on a matching corpus for supervision, we develop hierarchical core-fringe learning, which learns core and fringe terms jointly in a semi-supervised manner contextualized in the hierarchy of the domain. To reduce expensive human efforts, we employ automatic annotation and hierarchical positive-unlabeled learning. Our approach applies to big or small domains, covers head or tail terms, and requires little human effort. Extensive experiments demonstrate that our methods outperform strong baselines and even surpass professional human performance.
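The core-fringe idea in the abstract can be illustrated with a small sketch. This is not the authors' model: it swaps in plain label propagation over a k-nearest-neighbour graph, and the term vectors, the labels, and the function names `knn_graph` and `propagate` are all hypothetical. It only shows how a few labelled core terms can pass relevance scores to unlabelled fringe terms through semantic neighbours.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def knn_graph(vectors, k):
    """Symmetric k-nearest-neighbour graph over terms by cosine similarity."""
    neighbours = {term: set() for term in vectors}
    for term, vec in vectors.items():
        ranked = sorted((other for other in vectors if other != term),
                        key=lambda other: cosine(vec, vectors[other]),
                        reverse=True)
        for other in ranked[:k]:
            neighbours[term].add(other)
            neighbours[other].add(term)
    return neighbours

def propagate(neighbours, core_labels, iterations=50):
    """Clamp core terms to their labels; fringe terms average their neighbours."""
    scores = {term: core_labels.get(term, 0.5) for term in neighbours}
    for _ in range(iterations):
        updated = {}
        for term, nbrs in neighbours.items():
            if term in core_labels:
                updated[term] = core_labels[term]   # core term: label stays fixed
            elif nbrs:
                updated[term] = sum(scores[n] for n in nbrs) / len(nbrs)
            else:
                updated[term] = scores[term]        # isolated term: unchanged
        scores = updated
    return scores

# Hypothetical 2-D term vectors (assumption: stand-ins for real embeddings).
vectors = {
    "neural network": (0.9, 0.1),
    "backpropagation": (0.85, 0.2),
    "dropout layer": (0.8, 0.3),
    "photosynthesis": (0.1, 0.9),
    "chlorophyll": (0.2, 0.85),
    "stomata": (0.15, 0.8),
}
# Hypothetical core labels for the domain "deep learning": 1.0 relevant, 0.0 not.
core_labels = {"neural network": 1.0, "photosynthesis": 0.0}

graph = knn_graph(vectors, k=2)
scores = propagate(graph, core_labels)
for term in sorted(scores, key=scores.get, reverse=True):
    print(f"{term}: {scores[term]:.3f}")
```

In this toy setup the fringe terms near "neural network" drift toward a score of 1 and those near "photosynthesis" drift toward 0. The abstract describes a learned, hierarchy-aware model with positive-unlabeled training, which this sketch does not attempt to reproduce.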
