Chris Kamphuis

2 Papers

IR · Mar 18, 2020 · Code
Supporting Interoperability Between Open-Source Search Engines with the Common Index File Format

Jimmy Lin, Joel Mackenzie, Chris Kamphuis et al.

There exists a natural tension between encouraging a diverse ecosystem of open-source search engines and supporting fair, replicable comparisons across those systems. To balance these two goals, we examine two approaches to providing interoperability between the inverted indexes of several systems. The first takes advantage of internal abstractions around index structures, building wrappers that allow one system to directly read the indexes of another. The second involves sharing indexes across systems via a data exchange specification that we have developed, called the Common Index File Format (CIFF). We demonstrate the first approach with the Java systems Anserini and Terrier, and the second approach with Anserini, JASSv2, OldDog, PISA, and Terrier. Together, these systems provide a wide range of implementations and features, with different research goals. Overall, we recommend CIFF as a low-effort approach to support independent innovation while enabling the types of fair evaluations that are critical for driving the field forward.

IR · Oct 23, 2020
Exploring task-based query expansion at the TREC-COVID track

Thomas Schoegje, Chris Kamphuis, Koen Dercksen et al.

We explore how to generate effective queries based on search tasks. Our approach has three main steps: 1) identify search tasks based on research goals, 2) manually classify search queries according to those tasks, and 3) compare three methods to improve search rankings based on the task context. The most promising approach is based on expanding the user's query terms using task terms, which slightly improved the NDCG@20 scores over a BM25 baseline. Further improvements might be gained if we can identify more specific search tasks.