CL | AI | Aug 31, 2017

Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning

arXiv:1709.00103v7 | 1524 citations
Originality: Incremental advance
AI Analysis

This work targets non-expert users who need to retrieve data from relational databases without knowing SQL. It is an incremental advance: it builds on existing sequence-to-sequence methods and adds optimizations specific to SQL generation.

The authors address the problem of translating natural language questions into SQL queries for database access. They propose Seq2SQL, a deep neural network that uses reinforcement learning to improve query generation. On the WikiSQL dataset, Seq2SQL raises execution accuracy from 35.9% to 59.4% and logical form accuracy from 23.4% to 48.3% relative to an attentional sequence-to-sequence baseline.
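The two metrics quoted above measure different things, and a small sketch may help show why they diverge. The code below is an illustration only, not the paper's evaluation script: the `players` table, the helper names, and the whitespace-only normalization used for the logical form comparison are all assumptions made for this example.

```python
import sqlite3

def run_query(conn, sql):
    """Return the query's rows in a canonical order, or None if the SQL is invalid."""
    try:
        return sorted(conn.execute(sql).fetchall(), key=repr)
    except sqlite3.Error:
        return None

def logical_form_match(predicted_sql, gold_sql):
    # Assumption: a crude stand-in for logical form comparison that only
    # normalizes whitespace before comparing the two query strings.
    return " ".join(predicted_sql.split()) == " ".join(gold_sql.split())

def execution_match(conn, predicted_sql, gold_sql):
    # The prediction counts as correct when it runs and returns the gold result.
    predicted_rows = run_query(conn, predicted_sql)
    return predicted_rows is not None and predicted_rows == run_query(conn, gold_sql)

# Hypothetical toy table, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (name TEXT, team TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?, ?)",
    [("Ann", "A", 34), ("Bob", "A", 25), ("Cy", "B", 41)],
)

gold = "SELECT name FROM players WHERE team = 'A' AND age > 30"
pred = "SELECT name FROM players WHERE age > 30 AND team = 'A'"

print(logical_form_match(pred, gold))     # the strings differ
print(execution_match(conn, pred, gold))  # the results agree
```

In this sketch the predicted query swaps the order of the two `WHERE` conditions, so it fails the string comparison yet returns the same rows as the gold query. This is one reason execution accuracy can exceed logical form accuracy.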

A significant amount of the world's knowledge is stored in relational databases. However, the ability for users to retrieve facts from a database is limited due to a lack of understanding of query languages such as SQL. We propose Seq2SQL, a deep neural network for translating natural language questions to corresponding SQL queries. Our model leverages the structure of SQL queries to significantly reduce the output space of generated queries. Moreover, we use rewards from in-the-loop query execution over the database to learn a policy to generate unordered parts of the query, which we show are less suitable for optimization via cross entropy loss. In addition, we will publish WikiSQL, a dataset of 80654 hand-annotated examples of questions and SQL queries distributed across 24241 tables from Wikipedia. This dataset is required to train our model and is an order of magnitude larger than comparable datasets. By applying policy-based reinforcement learning with a query execution environment to WikiSQL, our model Seq2SQL outperforms attentional sequence to sequence models, improving execution accuracy from 35.9% to 59.4% and logical form accuracy from 23.4% to 48.3%.
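The "rewards from in-the-loop query execution" mentioned in the abstract can be sketched roughly as follows. The reward values (-2 for an invalid query, -1 for a valid query with the wrong result, +1 for the correct result) are recalled from the full paper rather than stated on this page, so treat them as an assumption here. The toy table, the function names, and the plain-Python REINFORCE-style loss are hypothetical and are not the authors' implementation.

```python
import sqlite3

def execute(conn, sql):
    """Return the query's rows in a canonical order, or None if the SQL is invalid."""
    try:
        return sorted(conn.execute(sql).fetchall(), key=repr)
    except sqlite3.Error:
        return None

def execution_reward(conn, predicted_sql, gold_sql):
    # Assumed reward scheme: -2 invalid SQL, -1 wrong result, +1 correct result.
    predicted_rows = execute(conn, predicted_sql)
    if predicted_rows is None:
        return -2.0
    if predicted_rows != execute(conn, gold_sql):
        return -1.0
    return 1.0

def reinforce_loss(token_log_probs, reward):
    # REINFORCE-style surrogate: minimizing this raises the likelihood of
    # sampled tokens when the reward is positive and lowers it when negative.
    return -reward * sum(token_log_probs)

# Hypothetical toy table, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (name TEXT, team TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?, ?)",
    [("Ann", "A", 34), ("Bob", "A", 25), ("Cy", "B", 41)],
)

gold = "SELECT name FROM players WHERE team = 'A' AND age > 30"
sampled = "SELECT name FROM players WHERE age > 30 AND team = 'A'"

reward = execution_reward(conn, sampled, gold)
loss = reinforce_loss([-0.5, -1.5], reward)  # log-probabilities are made up
print(reward, loss)
```

Because the reward depends only on the executed result, a sampled query whose `WHERE` conditions appear in a different order than the gold query is still rewarded. A token-level cross entropy loss would penalize that ordering instead.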

Code Implementations: 15 repos

Data from Papers with Code (CC-BY-SA-4.0)

