CL · AI · Nov 28, 2024

EzSQL: An SQL intermediate representation for improving SQL-to-text Generation

arXiv:2411.18923v2 · 2 citations · h-index: 6 · Expert Syst Appl
Originality: Incremental advance
AI Analysis

This work addresses the problem of generating natural language descriptions from SQL queries for database users, presenting an incremental improvement over existing methods.

The paper tackles SQL-to-text generation by proposing EzSQL, an SQL intermediate representation that simplifies queries to align them with natural language. It reports state-of-the-art results on the WikiSQL and Spider datasets and shows that the model can also enhance Text-to-SQL parsers by generating pretraining data.

The SQL-to-text generation task has traditionally been addressed with template-based, Seq2Seq, tree-to-sequence, and graph-to-sequence models. Recent models take advantage of pre-trained generative language models for this task in the Seq2Seq framework. However, treating SQL as a sequence of inputs to the pre-trained models is not optimal. In this work, we put forward a new SQL intermediate representation called EzSQL to align SQL with the natural language text sequence. EzSQL simplifies SQL queries and brings them closer to natural language text by modifying operators and keywords, which can usually be described in natural language. EzSQL also removes the need for set operators. Our proposed SQL-to-text generation model uses EzSQL as the input to a pre-trained generative language model to generate the text descriptions. We demonstrate that our model is an effective state-of-the-art method for generating text narrations from SQL queries on the WikiSQL and Spider datasets. We also show that by generating pretraining data with our SQL-to-text generation model, we can enhance the performance of Text-to-SQL parsers.
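To make the idea of "modifying operators and keywords" concrete, here is a minimal Python sketch of one such rewrite: replacing comparison operators with words before the query is passed to a language model. The mapping table and the function name `simplify_sql` are hypothetical and chosen only for illustration. They are not the actual EzSQL rules, which are defined in the paper.

```python
import re

# Hypothetical operator-to-phrase table (an assumption for illustration;
# not the mapping used by EzSQL). Longer operators come first so that
# ">=" is matched as a unit rather than as ">" followed by "=".
OPERATOR_WORDS = [
    (">=", "at least"),
    ("<=", "at most"),
    ("!=", "not equal to"),
    ("<>", "not equal to"),
    (">", "greater than"),
    ("<", "less than"),
    ("=", "equal to"),
]


def simplify_sql(sql: str) -> str:
    """Replace comparison operators with words (toy illustration only).

    This is a plain text substitution: it does not parse SQL, so
    operators inside string literals would be rewritten as well.
    """
    pattern = "|".join(re.escape(op) for op, _ in OPERATOR_WORDS)
    words = dict(OPERATOR_WORDS)
    rewritten = re.sub(pattern, lambda m: f" {words[m.group(0)]} ", sql)
    # Collapse the extra whitespace introduced by the substitution.
    return " ".join(rewritten.split())


print(simplify_sql("SELECT name FROM city WHERE population >= 1000"))
# prints: SELECT name FROM city WHERE population at least 1000
```

A real intermediate representation would work on a parsed query rather than on raw text, and according to the abstract it would also handle keywords and set operators. This sketch covers only the operator substitution.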

