CL ML May 17, 2021

SeaD: End-to-end Text-to-SQL Generation with Schema-aware Denoising

arXiv:2105.07911v2632 citations
Originality: Incremental advance
AI Analysis

This work addresses the text-to-SQL task for database query generation, showing incremental improvements in model performance.

The paper tackles the sub-optimal performance of seq-to-seq models in text-to-SQL generation by proposing a schema-aware denoising approach with auxiliary tasks and an improved decoding strategy, achieving new state-of-the-art results on the WikiSQL benchmark.

In the text-to-SQL task, seq-to-seq models often perform sub-optimally due to limitations in their architecture. In this paper, we present a simple yet effective approach that adapts a transformer-based seq-to-seq model to robust text-to-SQL generation. Instead of imposing constraints on the decoder or reformulating the task as slot-filling, we propose to train the seq-to-seq model with Schema-aware Denoising (SeaD), which consists of two denoising objectives that train the model to either recover the input or predict the output under two novel noises, erosion and shuffle. These denoising objectives act as auxiliary tasks for better modeling of structural data in seq-to-seq (S2S) generation. In addition, we propose an improved, clause-sensitive execution-guided (EG) decoding strategy to overcome the limitations of EG decoding for generative models. The experiments show that the proposed method improves the performance of the seq-to-seq model in both schema linking and grammar correctness and establishes a new state of the art on the WikiSQL benchmark. The results indicate that the capacity of the vanilla seq-to-seq architecture for text-to-SQL may have been underestimated.
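To make the execution-guided (EG) decoding idea concrete, here is a minimal sketch of plain EG filtering over a beam of candidate queries. It is not the paper's clause-sensitive variant, whose details the abstract does not give. The function name `execution_guided_pick`, the `players` table, and the rule "take the first candidate that runs and returns rows" are assumptions made for illustration, using Python's standard `sqlite3` module.

```python
import sqlite3


def execution_guided_pick(candidates, conn):
    """Hypothetical helper: return the first candidate SQL string that
    executes without error and yields a non-empty result.

    Falls back to the top-ranked candidate if none qualifies, and to
    None if the candidate list is empty.
    """
    for sql in candidates:
        try:
            rows = conn.execute(sql).fetchall()
        except sqlite3.Error:
            # Syntax error or unknown column/table: discard this candidate.
            continue
        if rows:
            return sql
    return candidates[0] if candidates else None


# Toy in-memory table standing in for a WikiSQL-style table (assumed data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (name TEXT, team TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?, ?)",
    [("Ann", "Red", 12), ("Bob", "Blue", 7)],
)

# Candidates ordered by model score, best first (assumed beam output).
beam = [
    "SELECT name FROM players WHERE tem = 'Red'",     # unknown column
    "SELECT name FROM players WHERE team = 'Green'",  # runs, but empty result
    "SELECT name FROM players WHERE team = 'Red'",    # runs and returns a row
]
best = execution_guided_pick(beam, conn)
print(best)
```

Filtering whole candidates after generation is the simplest form of the idea; a clause-sensitive strategy would presumably apply such checks per SQL clause, but that is beyond what this sketch shows.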

Code Implementations: 1 repo