A Transfer-Learnable Natural Language Interface for Databases
This work addresses the need for scalable, adaptable natural language interfaces for databases: it reduces the customization effort required for each specific database, though it builds incrementally on existing methods.
The paper tackles the problem of building a general-purpose natural language interface for any relational database. It introduces a transfer-learnable model that separates the schema from the data, with the schema extended to cover knowledge about natural language. The model achieves state-of-the-art performance on WikiSQL and transfers to OVERNIGHT without retraining.
Relational database management systems (RDBMSs) are powerful because they are able to optimize and answer queries against any relational database. A natural language interface (NLI) for a database, on the other hand, is tailored to support that specific database. In this work, we introduce a general-purpose transfer-learnable NLI with the goal of learning one model that can be used as an NLI for any relational database. We adopt the data management principle of separating data and its schema, but with additional support for the idiosyncrasy and complexity of natural languages. Specifically, we introduce an automatic annotation mechanism that separates the schema and the data, where the schema also covers knowledge about natural language. Furthermore, we propose a customized sequence model that translates annotated natural language queries to SQL statements. We show in experiments that our approach outperforms previous NLI methods on the WikiSQL dataset and that the model we learned can be applied to another benchmark dataset, OVERNIGHT, without retraining.
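The idea of separating schema and data through annotation can be illustrated with a small sketch. This is not the paper's mechanism: it assumes plain string matching in place of the learned annotator, and the function names `annotate` and `deannotate`, the placeholder symbols `c1`/`v1`, and the table name `t` are all hypothetical. The SQL template stands in for the output of the sequence model, which is not implemented here.

```python
import re

def annotate(question, columns, values):
    """Replace literal column and value mentions with placeholder symbols.

    Assumption: mentions appear verbatim in the question. The paper's
    annotator also handles paraphrases; this sketch does not.
    """
    mapping = {}
    annotated = question
    for kind, mentions in (("c", columns), ("v", values)):
        index = 0
        # Longest mentions first, so "home team" is tried before "team".
        for mention in sorted(mentions, key=len, reverse=True):
            pattern = re.compile(r"\b" + re.escape(mention) + r"\b", re.IGNORECASE)
            if pattern.search(annotated):
                index += 1
                symbol = f"{kind}{index}"
                mapping[symbol] = mention
                annotated = pattern.sub(symbol, annotated)
    return annotated, mapping

def deannotate(sql_template, mapping):
    """Substitute the concrete column names and values back into a SQL template."""
    def restore(match):
        symbol = match.group(0)
        if symbol not in mapping:
            return symbol  # leave unknown symbols untouched
        text = mapping[symbol]
        if symbol.startswith("c"):
            return '"' + text + '"'                  # quoted column identifier
        return "'" + text.replace("'", "''") + "'"   # quoted string literal
    return re.sub(r"\b[cv]\d+\b", restore, sql_template)

question = "What is the population of the city Toronto?"
annotated, mapping = annotate(question, ["city", "population"], ["Toronto", "Ottawa"])
print(annotated)

# Hand-written stand-in for what a translation model might emit.
template = "SELECT c1 FROM t WHERE c2 = v1"
print(deannotate(template, mapping))
```

Because the annotated question and the SQL template contain only placeholder symbols, nothing in them is tied to one database's column names or cell values, which is the property that makes transfer to a new database plausible.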