IR | CL | Jan 24, 2022

HC4: A New Suite of Test Collections for Ad Hoc CLIR

arXiv:2201.09992v1 | 35 citations
Originality: Synthesis-oriented
AI Analysis

HC4 gives cross-language information retrieval researchers a more reliable benchmark by addressing systematic gaps in the relevance judgments of existing collections. The contribution is incremental, since it builds on prior test collection designs.

To support the evaluation of neural cross-language information retrieval methods, the authors built HC4, a new suite of test collections with graded relevance judgments for Chinese, Persian, and Russian. The Chinese and Persian collections each contain 60 topics and about half a million documents; the Russian collection contains 54 topics and five million documents.

HC4 is a new suite of test collections for ad hoc Cross-Language Information Retrieval (CLIR), with Common Crawl News documents in Chinese, Persian, and Russian, topics in English and in the document languages, and graded relevance judgments. New test collections are needed because existing CLIR test collections built using pooling of traditional CLIR runs have systematic gaps in their relevance judgments when used to evaluate neural CLIR methods. The HC4 collections contain 60 topics and about half a million documents for each of Chinese and Persian, and 54 topics and five million documents for Russian. Active learning was used to determine which documents to annotate after being seeded using interactive search and judgment. Documents were judged on a three-grade relevance scale. This paper describes the design and construction of the new test collections and provides baseline results for demonstrating their utility for evaluating systems.
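Graded relevance judgments such as those described above are typically consumed by a rank-sensitive metric. The sketch below computes nDCG@k for one topic as an illustration only: the choice of nDCG, the gain values 0/1/2 standing in for the three relevance grades, and the document IDs are all assumptions made for this example, not details taken from the paper.

```python
import math


def dcg(gains):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg_at_k(ranked_doc_ids, qrels, k=10):
    """nDCG@k for a single topic.

    ranked_doc_ids: system ranking, best first.
    qrels: dict mapping doc id -> graded relevance; unjudged docs count as 0.
    """
    gains = [qrels.get(doc_id, 0) for doc_id in ranked_doc_ids[:k]]
    ideal_gains = sorted(qrels.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal_gains)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical judgments on an assumed 0/1/2 scale (three grades).
qrels = {"d1": 2, "d2": 1, "d3": 0, "d4": 2}

ideal_run = ["d1", "d4", "d2", "d3"]  # most relevant documents first
weak_run = ["d3", "d2", "d1", "d4"]   # non-relevant document ranked first

ideal_score = ndcg_at_k(ideal_run, qrels)
weak_score = ndcg_at_k(weak_run, qrels)
```

Treating unjudged documents as non-relevant, as this sketch does, is exactly where gaps in pooled judgments can penalize a system that retrieves relevant documents the pool never contained.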

Code Implementations: 1 repo
