CL · Aug 29, 2019

Global Reasoning over Database Structures for Text-to-SQL Parsing

Ben Bogin, Matt Gardner, Jonathan Berant

arXiv:1908.11214v1 · 30.61033 citations · Has Code

Originality: Highly original

AI Analysis

This work addresses the challenge of generating accurate SQL queries for unseen databases. It is incremental in that it builds on an existing state-of-the-art model.

The paper tackles the problem of zero-shot text-to-SQL parsing on complex databases by proposing a semantic parser that globally reasons over database structures to improve selection of database constants, increasing accuracy on the Spider dataset from 39.4% to 47.4%.

State-of-the-art semantic parsers rely on auto-regressive decoding, emitting one symbol at a time. When tested against complex databases that are unobserved at training time (zero-shot), the parser often struggles to select the correct set of database constants in the new database, due to the local nature of decoding. In this work, we propose a semantic parser that globally reasons about the structure of the output query to make a more contextually-informed selection of database constants. We use message-passing through a graph neural network to softly select a subset of database constants for the output query, conditioned on the question. Moreover, we train a model to rank queries based on the global alignment of database constants to question words. We apply our techniques to the current state-of-the-art model for Spider, a zero-shot semantic parsing dataset with complex databases, increasing accuracy from 39.4% to 47.4%.
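The soft-selection idea in the abstract can be illustrated with a toy sketch. This is not the authors' model: the schema graph, the relevance scores, the function name `soft_select`, and the hand-set weights are all assumptions made for illustration, whereas the paper uses a learned graph neural network. The sketch only shows how passing messages between neighbouring schema items lets a question-relevant table raise the selection probability of its columns.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def soft_select(neighbors, relevance, steps=2,
                self_weight=1.0, neighbor_weight=0.5, bias=-1.0):
    """Return a soft selection probability for every schema node.

    neighbors: dict mapping each node to its adjacent nodes
               (hypothetical schema graph: tables linked to their columns).
    relevance: dict mapping each node to a question-conditioned score
               in [0, 1] (assumed to be given by some upstream scorer).
    The weights are hand-set here; a real model would learn them.
    """
    state = dict(relevance)
    for _ in range(steps):
        new_state = {}
        for node, adjacent in neighbors.items():
            # Message: mean of the neighbours' current states.
            message = (sum(state[n] for n in adjacent) / len(adjacent)
                       if adjacent else 0.0)
            new_state[node] = sigmoid(
                self_weight * state[node] + neighbor_weight * message + bias
            )
        state = new_state
    return state


# Toy schema for a question such as "What are the names of all singers?"
neighbors = {
    "singer": ["singer.name"],
    "concert": ["concert.name"],
    "singer.name": ["singer"],
    "concert.name": ["concert"],
}
# Both "name" columns match the question equally well on their own;
# only the surrounding table distinguishes them.
relevance = {
    "singer": 0.9,
    "concert": 0.1,
    "singer.name": 0.5,
    "concert.name": 0.5,
}

probs = soft_select(neighbors, relevance)
for node in sorted(probs, key=probs.get, reverse=True):
    print(f"{node}: {probs[node]:.3f}")
```

In this toy setup the two `name` columns start with the same score, and the one attached to the question-relevant table ends up with the higher selection probability. That is the kind of contextual disambiguation the abstract attributes to global reasoning over the database structure.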
