Data-aware candidate selection in NL2SQL translation via small separating instances
Improves NL2SQL candidate selection for developers working with small candidate sets, but the evaluation covers only a subset of BIRD-DEV and the contribution is incremental.
The paper proposes a data-aware candidate selection method for NL2SQL translation, evaluated on a subset of BIRD-DEV, that outperforms baselines when only two or three candidates are available and no consistency score is given.
We propose a data-aware candidate selection method for NL2SQL translation based on separating instances and provenance. We implement this approach and evaluate it against three natural baselines on a subset of BIRD-DEV. Experiments show that our method significantly outperforms baselines when only two or three candidates are given and no consistency score is available. The code of our prototype can be found at https://github.com/staskikotx/SISelection
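To make the notion of a separating instance concrete, here is a minimal sketch. It assumes, since the abstract does not define the term, that a separating instance is a small database instance on which two candidate SQL queries return different results. The table `emp`, the two candidate queries, and the helpers `run_query` and `separates` are hypothetical illustrations and are not taken from the authors' prototype.

```python
import sqlite3


def run_query(rows, sql):
    """Run `sql` on a fresh in-memory table emp(name, dept, salary) holding `rows`."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER)")
        conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", rows)
        # Sort so that the comparison ignores row order (no ORDER BY in the queries).
        return sorted(conn.execute(sql).fetchall())
    finally:
        conn.close()


def separates(rows, sql_a, sql_b):
    """Assumed definition: an instance separates two queries if their results differ."""
    return run_query(rows, sql_a) != run_query(rows, sql_b)


# Two hypothetical candidates for "employees earning more than 50000".
CANDIDATE_A = "SELECT name FROM emp WHERE salary > 50000"
CANDIDATE_B = "SELECT name FROM emp WHERE salary >= 50000"

# On this instance both candidates agree, so it cannot tell them apart.
non_separating = [("Ann", "R&D", 60000), ("Bob", "Ops", 40000)]

# A single row on the boundary is enough to separate them.
separating = [("Cy", "Ops", 50000)]

if __name__ == "__main__":
    print(separates(non_separating, CANDIDATE_A, CANDIDATE_B))
    print(separates(separating, CANDIDATE_A, CANDIDATE_B))
```

Under this reading, a one-row instance such as `separating` shows where the candidates disagree, which is the kind of data-level evidence a selection step could use when no consistency score is available.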