Clause-Wise and Recursive Decoding for Complex and Cross-Domain Text-to-SQL Generation
This paper addresses text-to-SQL generation for complex, multi-table queries across different domains, and is an incremental improvement over prior work.
The paper tackles the problem of generating complex and cross-domain SQL queries from natural language, focusing on the Spider dataset, and achieves accuracy gains of 4.6% on the test set and 9.8% on the dev set.
Most deep learning approaches for text-to-SQL generation are limited to the WikiSQL dataset, which only supports very simple queries over a single table. We focus on the Spider dataset, a complex and cross-domain text-to-SQL task that includes complex queries over multiple tables. In this paper, we propose a SQL clause-wise decoding neural architecture with a self-attention based database schema encoder to address the Spider task. Each clause-specific decoder consists of a set of sub-modules defined by the syntax of that clause. Additionally, our model works recursively to support nested queries. When evaluated on the Spider dataset, our approach achieves accuracy gains of 4.6% and 9.8% on the test and dev sets, respectively. In addition, we show that our model is significantly more effective at predicting complex and nested queries than previous work.
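The control flow of clause-wise, recursive decoding can be illustrated with a toy sketch. This is not the authors' model: there is no neural network or schema encoder here. A hand-written nested dictionary (the hypothetical `plan`) stands in for what the clause-specific sub-modules would predict, and the function names `decode_select`, `decode_from`, `decode_where`, and `decode_query` are assumptions introduced only for illustration. The sketch shows only that each clause has its own decoder and that a nested query is handled by calling the top-level decoder again.

```python
from typing import Dict, List

# Hypothetical stand-in for sub-module predictions: one entry per SQL clause.
Plan = Dict[str, object]


def decode_select(plan: Plan) -> str:
    # SELECT clause decoder: emits the chosen columns or aggregates.
    return "SELECT " + ", ".join(plan["select"])


def decode_from(plan: Plan) -> str:
    # FROM clause decoder: emits the chosen tables.
    return "FROM " + ", ".join(plan["from"])


def decode_where(plan: Plan) -> str:
    # WHERE clause decoder: each condition is (column, operator, value).
    conditions = plan.get("where", [])
    if not conditions:
        return ""
    parts: List[str] = []
    for column, operator, value in conditions:
        if isinstance(value, dict):
            # A dict value marks a nested query: recurse into the full decoder.
            value = "(" + decode_query(value) + ")"
        parts.append(f"{column} {operator} {value}")
    return "WHERE " + " AND ".join(parts)


# One decoder per clause, applied in a fixed order (illustrative subset).
CLAUSE_DECODERS = [decode_select, decode_from, decode_where]


def decode_query(plan: Plan) -> str:
    # Run every clause decoder and join the non-empty clauses.
    clauses = [decoder(plan) for decoder in CLAUSE_DECODERS]
    return " ".join(clause for clause in clauses if clause)


nested_plan = {
    "select": ["name"],
    "from": ["singer"],
    "where": [("age", ">", {"select": ["avg(age)"], "from": ["singer"]})],
}
sql = decode_query(nested_plan)
print(sql)
```

In this sketch the recursion depth follows the nesting of the plan, so a subquery inside a subquery would be decoded by the same clause decoders without extra code.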