AI · Oct 30, 2024

BIS: NL2SQL Service Evaluation Benchmark for Business Intelligence Scenarios

arXiv:2410.22925v1 · 4 citations · h-index: 17 · ICSOC
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of evaluating NL2SQL services for business intelligence applications. It is incremental: it builds on existing benchmark concepts and adds a domain-specific focus.

The authors tackled the lack of suitable NL2SQL benchmarks for production business intelligence scenarios by developing a new benchmark focused on typical industrial BI questions, and they proposed two novel semantic similarity evaluation metrics for assessing NL2SQL capabilities.

NL2SQL (Natural Language to Structured Query Language) transformation has seen wide adoption in Business Intelligence (BI) applications in recent years. However, existing NL2SQL benchmarks are not suitable for production BI scenarios, as they are not designed for common business intelligence questions. To address this gap, we have developed a new benchmark focused on typical NL questions in industrial BI scenarios. We discuss the challenges of constructing a BI-focused benchmark and the shortcomings of existing benchmarks. Additionally, we introduce question categories in our benchmark that reflect common BI inquiries. Lastly, we propose two novel semantic similarity evaluation metrics for assessing NL2SQL capabilities in BI applications and services.

Code Implementations: 1 repo
