DB · CL · May 29, 2025

TailorSQL: An NL2SQL System Tailored to Your Query Workload

arXiv:2505.23039v1 · 4 citations · h-index: 11
Originality: Incremental advance
AI Analysis

This work addresses the challenge of making structured data more accessible to non-technical users in data applications. The advance is incremental, since it builds on existing NL2SQL techniques.

The paper improves NL2SQL translation by leveraging past query workloads, which implicitly contain information such as common join paths and the semantics of tables and columns. It reports up to a 2x improvement in execution accuracy on benchmarks.

NL2SQL (natural language to SQL) translates natural language questions into SQL queries, thereby making structured data accessible to non-technical users and serving as the foundation for intelligent data applications. State-of-the-art NL2SQL techniques typically perform translation by retrieving database-specific information, such as the database schema, and invoking a pre-trained large language model (LLM) with the question and the retrieved information to generate the SQL query. However, existing NL2SQL techniques miss a key opportunity that is present in real-world settings: NL2SQL is typically applied to existing databases that have already served many SQL queries in the past. The past query workload implicitly contains information that is helpful for accurate NL2SQL translation and is not apparent from the database schema alone, such as common join paths and the semantics of obscurely named tables and columns. We introduce TailorSQL, an NL2SQL system that takes advantage of information in the past query workload to improve both the accuracy and latency of translating natural language questions into SQL. By specializing to a given workload, TailorSQL achieves up to 2$\times$ improvement in execution accuracy on standardized benchmarks.
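To make the idea of workload-derived hints concrete, here is a minimal Python sketch that counts join conditions in past queries and adds the most frequent ones to an LLM prompt alongside the schema. It is an illustration under stated assumptions, not TailorSQL's actual method: the regex-based parsing is a deliberate simplification of real SQL parsing, and the workload, table names, column names, and helper functions (`join_paths`, `common_join_paths`, `build_prompt`) are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical past workload; all table and column names are invented.
PAST_QUERIES = [
    "SELECT c.name, SUM(o.amt) FROM cust c JOIN ord o ON c.id = o.cust_id GROUP BY c.name",
    "SELECT o.id FROM ord o JOIN cust c ON o.cust_id = c.id WHERE c.region = 'EU'",
    "SELECT p.title FROM ord o JOIN prod p ON o.prod_id = p.id",
]

# Matches "FROM table [AS] alias" / "JOIN table [AS] alias"; the lookahead
# keeps a following SQL keyword from being mistaken for an alias.
ALIAS_RE = re.compile(
    r"\b(?:FROM|JOIN)\s+(\w+)"
    r"(?:\s+(?:AS\s+)?(?!(?:ON|JOIN|WHERE|GROUP|ORDER|INNER|LEFT|RIGHT)\b)(\w+))?",
    re.IGNORECASE,
)
# Matches equality predicates between two qualified columns, e.g. "a.x = b.y".
JOIN_RE = re.compile(r"(\w+)\.(\w+)\s*=\s*(\w+)\.(\w+)")


def join_paths(sql):
    """Return the join conditions in one query as order-independent pairs."""
    aliases = {}
    for table, alias in ALIAS_RE.findall(sql):
        aliases[table] = table
        if alias:
            aliases[alias] = table
    paths = []
    for l_alias, l_col, r_alias, r_col in JOIN_RE.findall(sql):
        left = f"{aliases.get(l_alias, l_alias)}.{l_col}"
        right = f"{aliases.get(r_alias, r_alias)}.{r_col}"
        # Sort so that "a = b" and "b = a" count as the same join path.
        paths.append(tuple(sorted((left, right))))
    return paths


def common_join_paths(queries, top_k=2):
    """Count join paths across the workload and return the most frequent."""
    counts = Counter(path for q in queries for path in join_paths(q))
    return counts.most_common(top_k)


def build_prompt(question, schema, queries):
    """Assemble an LLM prompt from the schema plus workload-derived hints."""
    hints = "\n".join(
        f"- {a} = {b} (seen {n}x)" for (a, b), n in common_join_paths(queries)
    )
    return (
        f"Schema:\n{schema}\n\n"
        f"Frequent join paths in past queries:\n{hints}\n\n"
        f"Question: {question}\nSQL:"
    )


if __name__ == "__main__":
    schema = "cust(id, name, region); ord(id, cust_id, prod_id, amt); prod(id, title)"
    print(build_prompt("Total order amount per customer?", schema, PAST_QUERIES))
```

With obscure names such as `cust` and `ord`, the schema alone does not say which columns link the tables, whereas the join frequencies recovered from the workload do. A production system would likely use a proper SQL parser rather than regular expressions.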

