Is Table Retrieval a Solved Problem? Exploring Join-Aware Multi-Table Retrieval
This work addresses a critical bottleneck in open-domain QA systems, where accurate answers to complex questions depend on retrieving several tables that can be joined, though it is an incremental improvement over existing retrieval methods.
The paper tackles the problem of retrieving multiple tables that must be joined to answer questions in open-domain QA, where previous methods fall short because they do not consider join plans during retrieval. It introduces a re-ranking method, formulated as a mixed-integer program, that infers join relationships between tables, improving table retrieval F1 by up to 9.3% and end-to-end QA accuracy by up to 5.4%.
Retrieving relevant tables containing the necessary information to accurately answer a given question over tables is critical to open-domain question-answering (QA) systems. Previous methods assume the answer to such a question can be found either in a single table or multiple tables identified through question decomposition or rewriting. However, neither of these approaches is sufficient, as many questions require retrieving multiple tables and joining them through a join plan that cannot be discerned from the user query itself. If the join plan is not considered in the retrieval stage, the subsequent steps of reasoning and answering based on those retrieved tables are likely to be incorrect. To address this problem, we introduce a method that uncovers useful join relations for any query and database during table retrieval. We use a novel re-ranking method formulated as a mixed-integer program that considers not only table-query relevance but also table-table relevance that requires inferring join relationships. Our method outperforms the state-of-the-art approaches for table retrieval by up to 9.3% in F1 score and for end-to-end QA by up to 5.4% in accuracy.
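To make the idea of join-aware re-ranking concrete, the following is a minimal sketch, not the paper's formulation. It assumes that table-query relevance and table-table join compatibility scores are already available, and it replaces the mixed-integer program with an exhaustive search over subsets of size `k`, which is only practical for a handful of candidate tables. The function name `rerank_join_aware`, the `weight` parameter, the table names, and all scores are hypothetical and chosen for illustration.

```python
from itertools import combinations


def rerank_join_aware(query_rel, join_rel, k, weight=1.0):
    """Select k tables maximizing query relevance plus pairwise join compatibility.

    query_rel: dict mapping table name -> table-query relevance score (assumed given).
    join_rel:  dict mapping frozenset({a, b}) -> join compatibility score (assumed given).
    weight:    hypothetical trade-off between the two terms; 0 ignores joins entirely.
    """
    best_score, best_subset = float("-inf"), ()
    # Exhaustive search stands in for a MIP solver; fine for a few candidates only.
    for subset in combinations(sorted(query_rel), k):
        relevance = sum(query_rel[t] for t in subset)
        joinability = sum(
            join_rel.get(frozenset(pair), 0.0) for pair in combinations(subset, 2)
        )
        score = relevance + weight * joinability
        if score > best_score:
            best_score, best_subset = score, subset
    return set(best_subset), best_score


# Toy scores for a question such as "Which singers performed in a concert?"
# The bridge table looks irrelevant to the query but is needed for the join.
query_rel = {
    "singers": 0.9,
    "stadiums": 0.8,
    "concerts": 0.3,
    "singer_in_concert": 0.2,
}
join_rel = {
    frozenset({"singers", "singer_in_concert"}): 0.9,
    frozenset({"concerts", "singer_in_concert"}): 0.9,
    frozenset({"concerts", "stadiums"}): 0.8,
}

relevance_only, _ = rerank_join_aware(query_rel, join_rel, k=2, weight=0.0)
join_aware, _ = rerank_join_aware(query_rel, join_rel, k=2, weight=1.0)
print(sorted(relevance_only))
print(sorted(join_aware))
```

With these toy scores, ranking by relevance alone selects `singers` and `stadiums`, which cannot be joined, while the join-aware objective selects `singers` and `singer_in_concert`. This is the kind of case the abstract describes, where the join plan cannot be discerned from the user query itself.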