IR CL Jan 24, 2022

HC4: A New Suite of Test Collections for Ad Hoc CLIR

Dawn Lawrie, James Mayfield, Douglas Oard, Eugene Yang

arXiv:2201.09992v112.235 citations Has Code

Originality: Synthesis-oriented

AI Analysis

HC4 gives cross-language information retrieval researchers a more reliable benchmark by addressing systematic gaps in the relevance judgments of existing collections, though the contribution is incremental because it builds on prior test collection designs.

The authors address the problem of evaluating neural cross-language information retrieval (CLIR) methods by creating HC4, a new suite of test collections with graded relevance judgments for Chinese, Persian, and Russian. Each collection contains up to 60 topics and up to five million documents.

HC4 is a new suite of test collections for ad hoc Cross-Language Information Retrieval (CLIR), with Common Crawl News documents in Chinese, Persian, and Russian, topics in English and in the document languages, and graded relevance judgments. New test collections are needed because existing CLIR test collections built using pooling of traditional CLIR runs have systematic gaps in their relevance judgments when used to evaluate neural CLIR methods. The HC4 collections contain 60 topics and about half a million documents for each of Chinese and Persian, and 54 topics and five million documents for Russian. Active learning was used to determine which documents to annotate after being seeded using interactive search and judgment. Documents were judged on a three-grade relevance scale. This paper describes the design and construction of the new test collections and provides baseline results for demonstrating their utility for evaluating systems.
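
To illustrate why gaps in relevance judgments matter, here is a minimal sketch of nDCG computed over graded judgments, where any unjudged document is scored as non-relevant. It is not the paper's evaluation code. The document IDs, the two example runs, and the mapping of the three relevance grades to gains of 0, 1, and 2 are assumptions made for illustration, and the DCG variant used is the linear-gain form.

```python
import math


def dcg(gains):
    """Linear-gain DCG: each gain is discounted by log2(rank + 1), ranks starting at 1."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))


def ndcg_at_k(ranking, qrels, k=10):
    """nDCG@k where documents missing from qrels are treated as gain 0."""
    gains = [qrels.get(doc_id, 0) for doc_id in ranking[:k]]
    ideal = sorted(qrels.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical judgments for one topic (illustrative gains: 2, 1, 0).
qrels = {"d1": 2, "d2": 1, "d3": 0}

# A run that only returns judged documents, in ideal order.
pooled_run = ["d1", "d2", "d3"]

# A run that ranks an unjudged document ("d9") first. Even if d9 were
# relevant, it contributes nothing because it has no judgment.
neural_run = ["d9", "d1", "d2"]

pooled_score = ndcg_at_k(pooled_run, qrels)
neural_score = ndcg_at_k(neural_run, qrels)
print(pooled_score, neural_score)
```

In this sketch the second run scores lower only because its top-ranked document was never judged, which is the kind of bias that can arise when judgment pools are built from runs that differ systematically from the systems being evaluated.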
