CL AI DB · Jul 19, 2024

SQLfuse: Enhancing Text-to-SQL Performance through Comprehensive LLM Synergy

Tingkai Zhang, Chaoyu Chen, Cong Liao, Jun Wang, Xudong Zhao, Hang Yu, Jianchao Wang, Jianguo Li, Wenhui Shi

arXiv:2407.14568v17.714 citations · h-index: 11 · Has Code

Originality: Incremental advance

AI Analysis

This work enhances text-to-SQL translation for users in business and data roles, though it appears incremental as it builds on existing LLM advancements.

The paper addresses the underuse of open-source LLMs in text-to-SQL conversion and introduces SQLfuse, a system that achieves leading performance on the Spider Leaderboard and has been deployed by Ant Group.

Text-to-SQL conversion is a critical innovation, simplifying the transition from complex SQL to intuitive natural language queries, especially significant given SQL's prevalence in the job market across various roles. The rise of Large Language Models (LLMs) like GPT-3.5 and GPT-4 has greatly advanced this field, offering improved natural language understanding and the ability to generate nuanced SQL statements. However, the potential of open-source LLMs in Text-to-SQL applications remains underexplored, with many frameworks failing to leverage their full capabilities, particularly in handling complex database queries and incorporating feedback for iterative refinement. Addressing these limitations, this paper introduces SQLfuse, a robust system integrating open-source LLMs with a suite of tools to enhance Text-to-SQL translation's accuracy and usability. SQLfuse features four modules: schema mining, schema linking, SQL generation, and a SQL critic module, to not only generate but also continuously enhance SQL query quality. Demonstrated by its leading performance on the Spider Leaderboard and deployment by Ant Group, SQLfuse showcases the practical merits of open-source LLMs in diverse business contexts.
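The four-module flow named in the abstract (schema mining, schema linking, SQL generation, SQL critic) can be illustrated with a toy sketch. This is not SQLfuse's implementation: every function name here is hypothetical, the linking step is plain keyword overlap, the LLM is replaced by a stub callable that returns candidate queries, and the critic is reduced to a simple "does SQLite accept this query" check. It only shows how the stages might hand data to one another.

```python
import re
import sqlite3


def mine_schema(conn):
    """Schema mining (toy): read tables, columns, and primary keys from SQLite."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    for (table,) in tables:
        # Each PRAGMA row is (cid, name, type, notnull, dflt_value, pk).
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [
            {"name": c[1], "type": c[2], "pk": bool(c[5])} for c in cols
        ]
    return schema


def link_schema(question, schema):
    """Schema linking (toy): keep tables whose name or columns appear in the question."""
    words = set(re.findall(r"[a-z_]+", question.lower()))
    linked = {}
    for table, cols in schema.items():
        names = {table.lower()} | {c["name"].lower() for c in cols}
        if words & names:
            linked[table] = cols
    # Fall back to the full schema if nothing matched.
    return linked or schema


def generate_sql(question, linked, llm):
    """SQL generation: build a prompt and ask the (assumed) LLM for candidates."""
    lines = [
        f"{table}({', '.join(c['name'] for c in cols)})"
        for table, cols in linked.items()
    ]
    prompt = "Schema:\n" + "\n".join(lines) + f"\nQuestion: {question}\nSQL:"
    return llm(prompt)  # assumed to return a list of candidate SQL strings


def critic(conn, candidates):
    """SQL critic (toy stand-in): return the first candidate SQLite can plan."""
    for sql in candidates:
        try:
            conn.execute("EXPLAIN QUERY PLAN " + sql)
            return sql
        except sqlite3.Error:
            continue  # reject candidates that fail to compile
    return None


def stub_llm(prompt):
    """Hypothetical placeholder for an open-source LLM call."""
    return [
        "SELECT nam FROM users WHERE age > 30",   # deliberately broken
        "SELECT name FROM users WHERE age > 30",
    ]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)")

question = "What are the names of users older than 30?"
schema = mine_schema(conn)
linked = link_schema(question, schema)
best = critic(conn, generate_sql(question, linked, stub_llm))
print(best)
```

In this sketch the critic discards the first candidate because it references a column that does not exist, then accepts the second. A real critic would also need to judge whether a query that runs actually answers the question, which an executability check alone cannot do.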
