CL · Oct 21, 2020

DuoRAT: Towards Simpler Text-to-SQL Models

Torsten Scholak, Raymond Li, Dzmitry Bahdanau, Harm de Vries, Chris Pal

arXiv:2010.11119v2735 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the problem of model complexity for researchers and practitioners in text-to-SQL, but it is incremental as it builds on existing methods.

The paper tackles the complexity of neural text-to-SQL models by simplifying the state-of-the-art RAT-SQL model into DuoRAT, which uses only relation-aware or vanilla transformers, and finds that some techniques like structural SQL features are redundant.

Recent neural text-to-SQL models can effectively translate natural language questions to corresponding SQL queries on unseen databases. Working mostly on the Spider dataset, researchers have proposed increasingly sophisticated solutions to the problem. Contrary to this trend, in this paper we focus on simplifications. We begin by building DuoRAT, a re-implementation of the state-of-the-art RAT-SQL model that, unlike RAT-SQL, uses only relation-aware or vanilla transformers as its building blocks. We perform several ablation experiments using DuoRAT as the baseline model. Our experiments confirm the usefulness of some techniques and point out the redundancy of others, including structural SQL features and features that link the question with the schema.
