CL · Sep 12, 2021

Leveraging Table Content for Zero-shot Text-to-SQL with Meta-Learning

arXiv:2109.05395v1 · 11 citations
Originality: Highly original
AI Analysis

This paper addresses a critical bottleneck in applying text-to-SQL models to real-world scenarios with unseen tables, offering a practical solution that avoids costly annotations.

The paper tackles the problem of zero-shot text-to-SQL, where models must generate SQL queries for tables not seen during training, by proposing a method that leverages table content without extra manual annotations and uses meta-learning for generalization. The approach achieves significant improvements on WikiSQL and ESQL datasets, with further gains on zero-shot subsets.
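To make the idea of leveraging table content concrete, here is a minimal sketch of why cell values help link a question to a table the model has never seen. It is only an illustration: the paper's model is learned, whereas this toy uses plain substring matching, and the function `match_cells`, the example table, and the question are all hypothetical.

```python
def match_cells(question, table):
    """Return {column: [cell values that literally occur in the question]}."""
    q = question.lower()
    hits = {}
    for column, cells in table.items():
        found = [c for c in cells if str(c).lower() in q]
        if found:
            hits[column] = found
    return hits


# Hypothetical table that was never seen during training.
table = {
    "Player": ["Ann Lee", "Bo Chen"],
    "Team": ["Hawks", "Lions"],
    "Points": [31, 18],
}
question = "How many points did Bo Chen score for the Lions?"

# Cell values mentioned in the question point at the columns to filter on.
hits = match_cells(question, table)

# Turn each matched column into an equality condition (first match only).
where = " AND ".join(f"{col} = '{vals[0]}'" for col, vals in hits.items())
print(hits)
print(where)
```

Even without knowing the column names in advance, the matched cells suggest which columns belong in the WHERE clause, which is the kind of signal table content can provide for unseen tables.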

Single-table text-to-SQL aims to transform a natural language question into a SQL query over a single table. Recent work has made promising progress on this task using pre-trained language models and a multi-submodule framework. However, zero-shot tables, that is, tables that do not appear in the training set, are currently the most critical bottleneck restricting the application of existing approaches to real-world scenarios. Although some work has used auxiliary tasks to help handle zero-shot tables, the expensive extra manual annotation they require limits their practicality. In this paper, we propose a new approach to the zero-shot text-to-SQL task that does not rely on any additional manual annotations. Our approach consists of two parts. First, we propose a new model that leverages the abundant information in table content to help establish the mapping between questions and zero-shot tables. Second, we propose a simple but efficient meta-learning strategy to train our model. The strategy uses a two-step gradient update to force the model to learn to generalize to zero-shot tables. We conduct extensive experiments on a public open-domain text-to-SQL dataset, WikiSQL, and a domain-specific dataset, ESQL. Compared to existing approaches using the same pre-trained model, our approach achieves significant improvements on both datasets. Compared to approaches using a larger pre-trained model or a tabular-specific pre-trained model, our approach remains competitive. More importantly, on the zero-shot subsets of both datasets, the improvements are even larger.
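The two-step gradient update mentioned in the abstract can be sketched on a toy problem. This is a generic first-order meta-learning step written under stated assumptions, not the authors' implementation: the scalar model `y = w * x`, the learning rates, the choice to apply only the query gradient in the final step, and the names `support` and `query` (batches standing in for examples drawn from different tables) are all illustrative.

```python
def grad(w, batch):
    """Gradient of the mean squared error for the toy model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def loss(w, batch):
    """Mean squared error of the toy model y = w * x on a batch."""
    return sum((w * x - y) ** 2 for x, y in batch) / len(batch)


def two_step_update(w, support, query, inner_lr=0.02, outer_lr=0.02):
    # Step 1: provisional update using the support batch only.
    w_temp = w - inner_lr * grad(w, support)
    # Step 2: evaluate the provisional weights on a batch from other
    # "tables" and apply that gradient to the original weights
    # (first-order approximation; an assumption of this sketch).
    return w - outer_lr * grad(w_temp, query)


# Both batches follow y = 2 * x but cover different inputs,
# mimicking examples that come from different tables.
support = [(1.0, 2.0), (2.0, 4.0)]
query = [(3.0, 6.0), (4.0, 8.0)]

w = 0.0
for _ in range(50):
    w = two_step_update(w, support, query)

print(w, loss(w, query))
```

The point of the second step is that the weights are rewarded only if an update computed on one set of tables also lowers the loss on another set, which pushes training toward parameters that transfer to unseen tables.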

Code Implementations: 1 repo
