A Semantic Search Pipeline for Causality-driven Adhoc Information Retrieval
This work addresses causality-driven information retrieval for users who need to find causal relationships in documents. The contribution is incremental, since it builds on existing semantic and lexical methods.
The paper tackles the problem of retrieving documents that contain likely causes of a query event, as posed in the CAIR-2021 shared task. Its unsupervised semantic search pipeline outperformed traditional IR and pure semantic embedding-based approaches and led the leaderboard.
We present an unsupervised semantic search pipeline for the Causality-driven Adhoc Information Retrieval (CAIR-2021) shared task. The CAIR shared task expands traditional information retrieval to support the retrieval of documents containing the likely causes of a query event. A successful system must be able to distinguish between documents that are merely topical and documents that describe events causally related to the query event. Our approach involves aggregating results from multiple query strategies over a semantic and lexical index. The proposed approach leads the CAIR-2021 leaderboard and outperforms both traditional IR and pure semantic embedding-based approaches.
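The abstract does not say how results from the different query strategies are combined. As an illustration only, the sketch below uses reciprocal rank fusion (RRF), a common rank-aggregation technique that we assume here for the sake of the example; it is not necessarily the method used in the pipeline. The function name, the constant `k=60`, the strategy names, and the document ids are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    where rank starts at 1. The constant k dampens the influence of the
    very top ranks; 60 is a conventional default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending fused score; break ties by document id
    # so that the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical ranked lists, one per query strategy and index.
lexical_title = ["d3", "d1", "d7"]       # e.g. keyword search on the query title
semantic_title = ["d1", "d4", "d3"]      # e.g. embedding search on the query title
semantic_narrative = ["d1", "d3", "d9"]  # e.g. embedding search on the narrative

fused = reciprocal_rank_fusion(
    [lexical_title, semantic_title, semantic_narrative]
)
print(fused)
```

Documents that several strategies rank highly (here `d1` and `d3`) rise to the top of the fused list, while documents found by only one strategy are kept but ranked lower. Because RRF uses ranks rather than raw scores, it needs no score normalization between the lexical and semantic indexes.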