Feasible Plan Generation with Ambiguity-Boundedness in Cross-Model Query Processing
For developers of natural language interfaces to heterogeneous databases, this work provides a scalable method to handle plan ambiguity and feasibility, though it is an incremental improvement over existing packed parse forest ideas.
The paper tackles the problem of ambiguous intermediate logical plans (ILPs) in natural language interfaces to databases, which often lead to infeasible plans. It introduces the Packed Plan Forest (PPF), a polynomially bounded structure that compactly encodes all feasible ILPs while pruning infeasible ones early, achieving exponential compression with minimal overhead.
Natural language (NL) interfaces to databases broaden access to heterogeneous data but often yield many ambiguous intermediate logical plans (ILPs) due to uncertain operator scope and predicate semantics. Many candidates are infeasible because of type mismatches, missing bindings, or engine-specific constraints. We address this challenge with \emph{feasibility constraints} for detecting local inconsistencies and introduce the Packed Plan Forest (PPF) a polynomially bounded structure that compactly encodes all feasible ILPs while pruning infeasible ones early. Extending packed parse forest ideas to multi-model settings, PPF supports efficient feasibility analysis through annotated operators. Formal results show polynomial size under bounded arity and annotation vocabularies, and experiments confirm that PPFs capture exponentially many ILPs with minimal overhead, establishing a scalable foundation for NL-to-DB query planning across heterogeneous systems