Ad Hoc Table Retrieval using Semantic Similarity
The work addresses the need for efficient table retrieval in information access scenarios such as table completion and table mining, though its contribution is incremental because it builds on existing retrieval frameworks.
The paper tackles ad hoc table retrieval: ranking tables in response to keyword queries. It introduces a method that uses multiple semantic representations and similarity measures as features in a supervised model, and it reports significant improvements over a state-of-the-art baseline on a Wikipedia-based test collection.
We introduce and address the problem of ad hoc table retrieval: answering a keyword query with a ranked list of tables. This task is not only interesting on its own account, but is also being used as a core component in many other table-based information access scenarios, such as table completion or table mining. The main novel contribution of this work is a method for performing semantic matching between queries and tables. Specifically, we (i) represent queries and tables in multiple semantic spaces (both discrete sparse and continuous dense vector representations) and (ii) introduce various similarity measures for matching those semantic representations. We consider all possible combinations of semantic representations and similarity measures and use these as features in a supervised learning model. Using a purpose-built test collection based on Wikipedia tables, we demonstrate significant and substantial improvements over a state-of-the-art baseline.
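To make the feature construction concrete, the following is a minimal sketch of how every combination of semantic space and similarity measure could yield one feature for a supervised ranker. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not name the similarity measures, so the centroid-based and pairwise cosine measures here, the function names (`semantic_features`, `centroid_similarity`, `pairwise_similarity`), and the toy vectors are all hypothetical.

```python
import math
from itertools import product


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def centroid_similarity(query_vecs, table_vecs):
    # Assumed measure: compare the averaged query vector with the averaged table vector.
    return cosine(centroid(query_vecs), centroid(table_vecs))


def pairwise_similarity(query_vecs, table_vecs, aggregate):
    # Assumed measure: score every (query vector, table vector) pair, then aggregate.
    return aggregate(cosine(q, t) for q, t in product(query_vecs, table_vecs))


# Hypothetical set of similarity measures; the source does not enumerate them.
MEASURES = {
    "centroid": centroid_similarity,
    "pairwise_max": lambda q, t: pairwise_similarity(q, t, max),
    "pairwise_avg": lambda q, t: pairwise_similarity(q, t, mean),
}


def semantic_features(query_reprs, table_reprs):
    """Return one feature per (semantic space, similarity measure) combination.

    Both arguments map a semantic-space name to a list of vectors in that space.
    """
    features = {}
    for space in query_reprs:
        for name, measure in MEASURES.items():
            features[f"{space}:{name}"] = measure(query_reprs[space], table_reprs[space])
    return features


# Toy data: one sparse and one dense space, with made-up vectors.
query = {"sparse": [[1.0, 0.0, 0.0]], "dense": [[1.0, 0.0], [0.0, 1.0]]}
table = {"sparse": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], "dense": [[1.0, 0.0]]}
features = semantic_features(query, table)
```

The resulting `features` dictionary holds one value per space and measure pair (six in this toy setup). In a learning-to-rank setting, these values would be combined with any other query-table features and passed to whatever supervised model is used.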