CL | Sep 12, 2021

Leveraging Table Content for Zero-shot Text-to-SQL with Meta-Learning

Yongrui Chen, Xinnan Guo, Chaojie Wang, Jian Qiu, Guilin Qi, Meng Wang, Huiying Li

arXiv:2109.05395v11.411 citations | Has Code

Originality: Highly original

AI Analysis

This work addresses a critical bottleneck in applying text-to-SQL models to real-world scenarios with unseen tables, offering a practical solution that avoids costly manual annotations.

The paper tackles zero-shot text-to-SQL, where a model must generate SQL queries for tables not seen during training. It proposes a method that leverages table content without extra manual annotations and uses meta-learning for generalization. The approach achieves significant improvements on the WikiSQL and ESQL datasets, with further gains on their zero-shot subsets.

Single-table text-to-SQL aims to transform a natural language question into a SQL query over a single table. Recent work has made promising progress on this task using pre-trained language models and a multi-submodule framework. However, zero-shot tables, that is, tables that do not appear in the training set, are currently the most critical bottleneck restricting the application of existing approaches to real-world scenarios. Although some work has used auxiliary tasks to help handle zero-shot tables, the expensive extra manual annotation they require limits their practicality. In this paper, we propose a new approach for the zero-shot text-to-SQL task that does not rely on any additional manual annotations. Our approach consists of two parts. First, we propose a new model that leverages the abundant information in table content to help establish the mapping between questions and zero-shot tables. Second, we propose a simple but efficient meta-learning strategy to train our model. The strategy uses a two-step gradient update to force the model to learn to generalize to zero-shot tables. We conduct extensive experiments on WikiSQL, a public open-domain text-to-SQL dataset, and ESQL, a domain-specific dataset. Compared to existing approaches using the same pre-trained model, our approach achieves significant improvements on both datasets. It also remains competitive with approaches that use a larger pre-trained model or a tabular-specific pre-trained model. More importantly, on the zero-shot subsets of both datasets, the improvements are even larger.
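The abstract does not spell out the two-step gradient update, so the following is only a minimal sketch of one common reading: a first-order, MAML-style step in which a temporary update on a "support" batch is followed by an update of the original weights using the gradient measured on a "query" batch drawn from different tables. The toy linear model, the function names (`mse_and_grad`, `two_step_update`, `make_batch`), and the learning rates `inner_lr` and `outer_lr` are all assumptions made for illustration and are not taken from the paper or its code.

```python
import random


def mse_and_grad(w, batch):
    """Mean squared error and its gradient for a toy linear model y = w . x."""
    n = len(batch)
    loss = 0.0
    grad = [0.0] * len(w)
    for x, y in batch:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        loss += err * err / n
        for i, xi in enumerate(x):
            grad[i] += 2.0 * err * xi / n
    return loss, grad


def two_step_update(w, support, query, inner_lr=0.05, outer_lr=0.05):
    """One hypothetical two-step update (first-order approximation)."""
    # Step 1: temporary update using the support batch.
    _, g_support = mse_and_grad(w, support)
    w_tmp = [wi - inner_lr * gi for wi, gi in zip(w, g_support)]
    # Step 2: measure how the temporarily updated weights do on the query
    # batch, then apply that gradient to the ORIGINAL weights.
    query_loss, g_query = mse_and_grad(w_tmp, query)
    w_new = [wi - outer_lr * gi for wi, gi in zip(w, g_query)]
    return w_new, query_loss


def make_batch(rng, true_w, scale, n=8):
    """Synthetic stand-in for examples from one group of tables (assumption)."""
    batch = []
    for _ in range(n):
        x = [rng.uniform(-scale, scale) for _ in true_w]
        y = sum(wi * xi for wi, xi in zip(true_w, x))
        batch.append((x, y))
    return batch


rng = random.Random(0)
true_w = [1.5, -0.5]
w = [0.0, 0.0]
losses = []
for _ in range(200):
    # Support and query batches use different input ranges to mimic
    # "seen" versus "held-out" tables within a single training task.
    support = make_batch(rng, true_w, scale=1.0)
    query = make_batch(rng, true_w, scale=2.0)
    w, query_loss = two_step_update(w, support, query)
    losses.append(query_loss)
```

The design point this sketch tries to capture is that the weights are rewarded for performing well on data they were not just adapted to, which is one plausible way to encourage generalization to unseen tables. A real text-to-SQL model would replace the linear model with the network's loss and would rely on an autodiff framework rather than hand-written gradients.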

