RH-SQL: Refined Schema and Hardness Prompt for Text-to-SQL
This work addresses cost barriers for practical text-to-SQL applications, though it appears incremental as it builds on existing complexity-based methods.
The paper tackles the high storage and training costs of complexity-based text-to-SQL methods by introducing a refined schema and hardness prompt approach, which achieves an execution accuracy of 82.6% on the Spider dataset.
Text-to-SQL converts natural language queries into Structured Query Language (SQL). A recent line of research builds on the complexity of SQL queries and has achieved notable performance improvements. However, existing methods entail significant storage and training costs, which hampers their practical application. To address this issue, this paper introduces a Text-to-SQL method based on a Refined Schema and a Hardness Prompt. The refined schema filters out low-relevance schema information, and a Language Model (LM) identifies the hardness of the query, which is then used to form the prompt. Together, these reduce storage and training costs while maintaining performance. The method is applicable to any sequence-to-sequence (seq2seq) LM. In our experiments on the Spider dataset with large-scale LMs, it achieves an execution accuracy (EX) of 82.6%, demonstrating its effectiveness and its suitability for real-world applications.
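As a rough illustration of the two ideas above, the following minimal Python sketch filters a schema and prepends a hardness label to the model input. It is not the paper's implementation: the function names (`refine_schema`, `build_prompt`), the word-overlap relevance score, and the prompt layout are all assumptions made for this example. The paper identifies hardness with an LM, whereas here the label is simply passed in. The four labels follow the difficulty levels used by Spider's evaluation.

```python
import re

# Difficulty labels used by Spider's evaluation.
HARDNESS_LEVELS = ("easy", "medium", "hard", "extra")


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def refine_schema(question, schema, keep_tables=2):
    """Keep only tables and columns that share a token with the question.

    Hypothetical stand-in for the paper's refined schema: relevance here
    is plain word overlap, not a learned score.
    """
    question_tokens = tokenize(question)
    scored = []
    for table, columns in schema.items():
        kept = [c for c in columns if tokenize(c) & question_tokens]
        score = len(tokenize(table) & question_tokens) + len(kept)
        # If only the table name matched, keep all of its columns.
        scored.append((score, table, kept or columns))
    scored.sort(key=lambda item: item[0], reverse=True)
    return {t: cols for score, t, cols in scored[:keep_tables] if score > 0}


def build_prompt(question, schema, hardness):
    """Serialize hardness, question, and schema into one seq2seq input.

    The layout is an assumption; in the paper the hardness label comes
    from an LM rather than from the caller.
    """
    if hardness not in HARDNESS_LEVELS:
        raise ValueError(f"unknown hardness label: {hardness!r}")
    schema_text = " | ".join(
        f"{table}: {', '.join(cols)}" for table, cols in schema.items()
    )
    return f"hardness: {hardness} | question: {question} | schema: {schema_text}"


if __name__ == "__main__":
    toy_schema = {
        "singer": ["singer_id", "name", "age"],
        "concert": ["concert_id", "venue", "year"],
        "stadium": ["stadium_id", "capacity"],
    }
    q = "What is the name and age of each singer?"
    print(build_prompt(q, refine_schema(q, toy_schema), "easy"))
```

In this toy setup the string returned by `build_prompt` would be fed to any seq2seq LM as its input. Dropping the unrelated tables shortens that input, which is one way a refined schema could lower cost.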