A comprehensive evaluation of ChatGPT's zero-shot Text-to-SQL capability
The paper addresses a practical problem, querying databases in natural language with AI, but its contribution is incremental because it focuses on evaluating an existing model.
This paper evaluates ChatGPT's zero-shot Text-to-SQL capability on 12 benchmark datasets. ChatGPT performs strongly and, in the ADVETA (RPL) scenario, outperforms the state-of-the-art (SOTA) fine-tuned model by 4.1%, though it generally lags behind SOTA.
This paper presents the first comprehensive analysis of ChatGPT's Text-to-SQL ability. Given the recent emergence of the large-scale conversational language model ChatGPT and its impressive capabilities in both conversation and code generation, we sought to evaluate its Text-to-SQL performance. We conducted experiments on 12 benchmark datasets covering different languages, settings, and scenarios, and the results demonstrate that ChatGPT has strong Text-to-SQL abilities. Although a gap remains relative to the current state-of-the-art (SOTA) models, ChatGPT's performance is still impressive given that the experiments were conducted in a zero-shot scenario. Notably, in the ADVETA (RPL) scenario, zero-shot ChatGPT even outperforms the SOTA model, which requires fine-tuning on the Spider dataset, by 4.1%, demonstrating its potential for use in practical applications. To support further research in related fields, we have made the data generated by ChatGPT publicly available at https://github.com/THU-BPM/chatgpt-sql.
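To make the zero-shot setting concrete, the following is a minimal sketch of how such a prompt might be assembled: the model receives only the database schema and the question, with no worked examples. The function name `build_zero_shot_prompt`, the prompt wording, and the sample schema are all hypothetical illustrations and are not taken from the paper.

```python
from typing import Dict, List


def build_zero_shot_prompt(schema: Dict[str, List[str]], question: str) -> str:
    """Assemble a zero-shot prompt: schema plus question, no demonstrations.

    The wording below is an assumed format for illustration, not the
    prompt used in the paper.
    """
    lines = ["Translate the question into a SQL query for the database below.", ""]
    # Describe each table and its columns in plain text.
    for table, columns in schema.items():
        lines.append(f"Table {table}: {', '.join(columns)}")
    # End with the question and an open slot for the model's SQL answer.
    lines += ["", f"Question: {question}", "SQL:"]
    return "\n".join(lines)


# Hypothetical schema and question, for illustration only.
schema = {
    "singer": ["singer_id", "name", "country"],
    "concert": ["concert_id", "singer_id", "year"],
}
prompt = build_zero_shot_prompt(schema, "How many singers are from France?")
print(prompt)
```

The resulting string would be sent to the model as a single message. Because it contains no example question and SQL pairs, any correct query the model returns reflects zero-shot ability rather than in-context learning or fine-tuning.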