AI, CL · Nov 1, 2021

Outlining and Filling: Hierarchical Query Graph Generation for Answering Complex Questions over Knowledge Graphs

Yongrui Chen, Huiying Li, Guilin Qi, Tianxing Wu, Tenggou Wang

arXiv:2111.00732v2 · 13.834 citations · Has Code

Originality: Highly original

AI Analysis

This paper addresses the problem of answering complex natural language questions over knowledge graphs, with applications such as search and AI assistants. It represents a strong, specific gain rather than a foundational breakthrough.

The paper tackles the construction of correct SPARQL queries from natural language questions over knowledge graphs. Complex questions raise three challenges: complicated SPARQL syntax, a huge search space, and locally ambiguous query graphs. The authors propose a hierarchical autoregressive decoding model that first generates an AQG to constrain the search and then fills its slots with candidate instances. They report state-of-the-art results on all three benchmark datasets used.

Query graph construction aims to construct the correct executable SPARQL query over the knowledge graph (KG) to answer natural language questions. Although recent methods have achieved good results using neural network-based query graph ranking, they face three new challenges when handling more complex questions: 1) complicated SPARQL syntax, 2) a huge search space, and 3) locally ambiguous query graphs. In this paper, we provide a new solution. As preparation, we extend the query graph by treating each SPARQL clause as a subgraph consisting of vertices and edges, and we define a unified graph grammar called AQG to describe the structure of query graphs. Based on these concepts, we propose a novel end-to-end model that performs hierarchical autoregressive decoding to generate query graphs. The high-level decoding generates an AQG as a constraint to prune the search space and reduce locally ambiguous query graphs. The bottom-level decoding completes the query graph by selecting appropriate instances from pre-prepared candidates to fill the slots in the AQG. The experimental results show that our method greatly improves the SOTA performance on complex KGQA benchmarks. Equipped with pre-trained models, the performance of our method is further improved, achieving SOTA on all three datasets used.
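
The two decoding levels described above can be illustrated with a small sketch. This is not the authors' model: the `outline` step is a hard-coded stub where the paper uses a learned autoregressive decoder, the token-overlap scorer stands in for a neural ranker, and every name here (`AQG`, `Edge`, `outline`, `fill`, `to_sparql`, the `ex:` identifiers, the candidate lists) is a hypothetical chosen for illustration. It only shows the control flow: fix a structure first, then fill its slots from prepared candidates.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Edge:
    src: int             # index of the subject vertex
    dst: int             # index of the object vertex
    label: str = "rel"   # class label only; no concrete KG relation yet

@dataclass
class AQG:
    vertices: list       # class label per vertex: "ans", "var", or "ent"
    edges: list          # list of Edge

def outline(question):
    """High-level step (stub): always returns the same two-hop structure.
    A real system would decode this structure from the question."""
    return AQG(vertices=["ans", "var", "ent"],
               edges=[Edge(1, 0), Edge(1, 2)])

def fill(aqg, question, candidates):
    """Bottom-level step: greedily pick one candidate per slot.
    Each pick excludes earlier picks, so later choices depend on earlier ones."""
    tokens = set(question.lower().split())
    used, filled = set(), {}

    def pick(kind):
        pool = [c for c in candidates[kind] if c[0] not in used]
        # Toy score (assumption): word overlap between question and candidate text.
        best = max(pool, key=lambda c: len(tokens & set(c[1].split())))
        used.add(best[0])
        return best[0]

    for i, label in enumerate(aqg.vertices):
        if label == "ent":
            filled[("v", i)] = pick("ent")
    for k, edge in enumerate(aqg.edges):
        filled[("e", k)] = pick(edge.label)
    return filled

def to_sparql(aqg, filled):
    """Render the filled graph as a SPARQL SELECT query string."""
    def term(i):
        label = aqg.vertices[i]
        if label == "ent":
            return filled[("v", i)]
        return "?ans" if label == "ans" else f"?v{i}"

    triples = [f"{term(e.src)} {filled[('e', k)]} {term(e.dst)} ."
               for k, e in enumerate(aqg.edges)]
    return ("PREFIX ex: <http://example.org/> "
            "SELECT DISTINCT ?ans WHERE { " + " ".join(triples) + " }")

# Hypothetical candidates: (KG identifier, surface text).
CANDIDATES = {
    "ent": [("ex:Tom_Hanks", "tom hanks"), ("ex:Tom_Hardy", "tom hardy")],
    "rel": [("ex:directedBy", "directed by"), ("ex:starring", "starring"),
            ("ex:producedBy", "produced by")],
}
QUESTION = "who directed films starring tom hanks"

aqg = outline(QUESTION)
filled = fill(aqg, QUESTION, CANDIDATES)
print(to_sparql(aqg, filled))
```

Because the structure is fixed before any instance is chosen, the filling step only has to rank candidates for three slots instead of searching over whole query graphs, which is the pruning effect the abstract attributes to the AQG constraint.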
