IR AI CL DB LG

Nov 17, 2025

Exploring Multi-Table Retrieval Through Iterative Search

Allaa Boutaleb, Bernd Amann, Rafael Angarita, Hubert Naacke

arXiv:2511.13418v16.31 citations

h-index: 12

Originality: Incremental advance

AI Analysis

This work addresses scalable, coherent multi-table retrieval for question answering systems, offering a practical iterative alternative to exact optimization with substantial speedups.

The paper tackles the problem of retrieving and composing information from multiple tables for open-domain question answering by proposing an iterative search framework that achieves competitive retrieval performance while running 4-400x faster than exact optimization methods.

Open-domain question answering over datalakes requires retrieving and composing information from multiple tables, a challenging subtask that demands semantic relevance and structural coherence (e.g., joinability). While exact optimization methods like Mixed-Integer Programming (MIP) can ensure coherence, their computational complexity is often prohibitive. Conversely, simpler greedy heuristics that optimize for query coverage alone often fail to find these coherent, joinable sets. This paper frames multi-table retrieval as an iterative search process, arguing this approach offers advantages in scalability, interpretability, and flexibility. We propose a general framework and a concrete instantiation: a fast, effective Greedy Join-Aware Retrieval algorithm that holistically balances relevance, coverage, and joinability. Experiments across 5 NL2SQL benchmarks demonstrate that our iterative method achieves competitive retrieval performance compared to the MIP-based approach while being 4-400x faster depending on the benchmark and search space settings. This work highlights the potential of iterative heuristics for practical, scalable, and composition-aware retrieval.
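To make the idea of balancing relevance, coverage, and joinability in one greedy loop more concrete, here is a minimal Python sketch. It is not the paper's algorithm: the abstract does not specify the scoring function, so the additive score, the weights `alpha`, `beta`, `gamma`, the function name `greedy_join_aware`, and the toy tables are all illustrative assumptions.

```python
from typing import Dict, FrozenSet, List, Set


def greedy_join_aware(
    relevance: Dict[str, float],
    coverage: Dict[str, Set[str]],
    join_score: Dict[FrozenSet[str], float],
    query_terms: Set[str],
    k: int,
    alpha: float = 1.0,  # weight on relevance (assumed)
    beta: float = 1.0,   # weight on newly covered query terms (assumed)
    gamma: float = 1.0,  # weight on joinability with selected tables (assumed)
) -> List[str]:
    """Pick up to k tables, one per iteration, by a combined greedy score."""
    selected: List[str] = []
    covered: Set[str] = set()
    candidates = set(relevance)

    while candidates and len(selected) < k:

        def gain(table: str) -> float:
            # Coverage gain: fraction of query terms this table newly covers.
            new_terms = (coverage.get(table, set()) & query_terms) - covered
            cov = len(new_terms) / len(query_terms) if query_terms else 0.0
            # Joinability: best join score with any already selected table.
            join = max(
                (join_score.get(frozenset((table, s)), 0.0) for s in selected),
                default=0.0,
            )
            return alpha * relevance[table] + beta * cov + gamma * join

        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(candidates), key=gain)
        selected.append(best)
        covered |= coverage.get(best, set()) & query_terms
        candidates.remove(best)

    return selected


# Hypothetical toy data, invented for illustration only.
relevance = {"orders": 0.9, "customers": 0.6, "weather": 0.7, "products": 0.5}
coverage = {
    "orders": {"order", "total"},
    "customers": {"customer", "name"},
    "weather": {"total"},
    "products": {"product"},
}
join_score = {
    frozenset(("orders", "customers")): 1.0,
    frozenset(("orders", "products")): 0.8,
}
query_terms = {"order", "total", "customer", "name"}

join_aware = greedy_join_aware(relevance, coverage, join_score, query_terms, k=2)
relevance_only = greedy_join_aware(
    relevance, coverage, join_score, query_terms, k=2, beta=0.0, gamma=0.0
)
print(join_aware, relevance_only)
```

On this toy input, setting `beta` and `gamma` to zero reduces the loop to a plain relevance ranking, which picks a table that cannot be joined with the first one; the combined score instead prefers a joinable table that also covers the remaining query terms. Each iteration scans the remaining candidates once, so the cost grows with `k` times the number of candidate tables rather than with the size of a combinatorial search space.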
