CL | AI | LG | Oct 23, 2023

Linking Surface Facts to Large-Scale Knowledge Graphs

arXiv:2310.14909v1134 citations | h-index: 16 | Has Code
Originality: Synthesis-oriented
AI Analysis

This paper addresses the challenge of combining high-coverage text extraction with precise knowledge representation for downstream AI applications. Its contribution is incremental: it focuses on benchmarking rather than on a new linking method.

The paper tackles the problem of linking ambiguous surface facts from Open Information Extraction to canonical entities and predicates in Knowledge Graphs. It proposes a new benchmark with evaluation protocols, which reveal that detecting out-of-KG entities and predicates is more difficult than linking accurately to existing ones.

Open Information Extraction (OIE) methods extract facts from natural language text in the form of ("subject"; "relation"; "object") triples. These facts are, however, merely surface forms, the ambiguity of which impedes their downstream usage; e.g., the surface phrase "Michael Jordan" may refer to either the former basketball player or the university professor. Knowledge Graphs (KGs), on the other hand, contain facts in a canonical (i.e., unambiguous) form, but their coverage is limited by a static schema (i.e., a fixed set of entities and predicates). To bridge this gap, we need the best of both worlds: (i) high coverage of free-text OIEs, and (ii) semantic precision (i.e., monosemy) of KGs. In order to achieve this goal, we propose a new benchmark with novel evaluation protocols that can, for example, measure fact linking performance on a granular triple slot level, while also measuring whether a system has the ability to recognize that a surface form has no match in the existing KG. Our extensive evaluation of several baselines shows that detection of out-of-KG entities and predicates is more difficult than accurate linking to existing ones, thus calling for more research efforts on this difficult task. We publicly release all resources (data, benchmark and code) at https://github.com/nec-research/fact-linking.
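
To make the two evaluation ideas in the abstract concrete, here is a minimal Python sketch of slot-level linking accuracy and out-of-KG detection. It is an illustration under stated assumptions, not the benchmark's actual protocol or code: the function names (`slot_accuracy`, `out_of_kg_f1`), the use of `None` to mark a slot with no KG match, and the identifiers such as `ent:jordan_player` are all hypothetical.

```python
from typing import Dict, List, Optional

# Assumption: a linked fact maps each triple slot to a KG identifier,
# or to None when the surface form has no match in the KG (out-of-KG).
SLOTS = ("subject", "relation", "object")
LinkedFact = Dict[str, Optional[str]]


def slot_accuracy(gold: List[LinkedFact], pred: List[LinkedFact]) -> Dict[str, float]:
    """Accuracy per slot, plus the share of facts with all three slots correct."""
    n = len(gold)
    scores = {
        slot: sum(g[slot] == p[slot] for g, p in zip(gold, pred)) / n
        for slot in SLOTS
    }
    scores["fact"] = sum(
        all(g[slot] == p[slot] for slot in SLOTS) for g, p in zip(gold, pred)
    ) / n
    return scores


def out_of_kg_f1(gold: List[LinkedFact], pred: List[LinkedFact], slot: str) -> float:
    """F1 for detecting that a slot has no KG match (None is the positive class)."""
    tp = sum(g[slot] is None and p[slot] is None for g, p in zip(gold, pred))
    fp = sum(g[slot] is not None and p[slot] is None for g, p in zip(gold, pred))
    fn = sum(g[slot] is None and p[slot] is not None for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold links and system predictions for three surface facts.
gold = [
    {"subject": "ent:jordan_player", "relation": "rel:member_of_team", "object": "ent:chicago_bulls"},
    {"subject": "ent:jordan_professor", "relation": None, "object": "ent:uc_berkeley"},
    {"subject": None, "relation": "rel:born_in", "object": "ent:brooklyn"},
]
pred = [
    {"subject": "ent:jordan_player", "relation": "rel:member_of_team", "object": "ent:chicago_bulls"},
    {"subject": "ent:jordan_player", "relation": "rel:employer", "object": "ent:uc_berkeley"},
    {"subject": None, "relation": "rel:born_in", "object": "ent:brooklyn"},
]

scores = slot_accuracy(gold, pred)
print(scores)
print({slot: out_of_kg_f1(gold, pred, slot) for slot in SLOTS})
```

In this toy data the second fact shows both failure modes at once: the "Michael Jordan" subject is linked to the wrong entity, and the system links a relation that the gold annotation marks as out-of-KG. Scoring each slot separately keeps those two errors distinguishable instead of collapsing them into one wrong fact.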

Code Implementations: 1 repo