IR · CL · Jan 2, 2022

Establishing Strong Baselines for TripClick Health Retrieval

arXiv:2201.00365v1 · 13 citations
Originality: Synthesis-oriented
AI Analysis

This work provides incremental improvements for health information retrieval by establishing stronger baselines on a specific dataset.

The authors improve retrieval on the TripClick health collection by developing Transformer-based re-ranking and dense retrieval baselines, achieving large gains over BM25 with simple training enhancements.

We present strong Transformer-based re-ranking and dense retrieval baselines for the recently released TripClick health ad-hoc retrieval collection. We improve the originally too noisy training data with a simple negative sampling policy. We achieve large gains over BM25 in the re-ranking task of TripClick, which were not achieved with the original baselines. Furthermore, we study the impact of different domain-specific pre-trained models on TripClick. Finally, we show that dense retrieval outperforms BM25 by considerable margins, even with simple training procedures.
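
The abstract names a "simple negative sampling policy" without spelling it out. As a rough illustration only, the sketch below shows one common way to build (query, positive, negative) training triples from click data: take negatives from a query's BM25 candidates that received no click. The function name, its inputs, and the BM25-candidate assumption are all hypothetical and are not taken from the paper.

```python
import random
from typing import Dict, List, Set, Tuple


def sample_training_triples(
    bm25_candidates: Dict[str, List[str]],  # hypothetical: query id -> doc ids ranked by BM25
    clicked: Dict[str, Set[str]],           # hypothetical: query id -> doc ids judged relevant via clicks
    negatives_per_positive: int = 1,
    seed: int = 0,
) -> List[Tuple[str, str, str]]:
    """Build (query, positive, negative) triples.

    Assumption (not confirmed by the source): negatives are drawn at random
    from the BM25 candidates of the same query that have no click signal.
    """
    rng = random.Random(seed)
    triples: List[Tuple[str, str, str]] = []
    for qid, ranked in bm25_candidates.items():
        positives = clicked.get(qid, set())
        # Candidate negatives: retrieved by BM25 but never clicked for this query.
        pool = [doc_id for doc_id in ranked if doc_id not in positives]
        if not positives or not pool:
            continue  # skip queries without both a positive and a usable negative
        for pos in sorted(positives):
            k = min(negatives_per_positive, len(pool))
            for neg in rng.sample(pool, k):
                triples.append((qid, pos, neg))
    return triples


if __name__ == "__main__":
    candidates = {"q1": ["d1", "d2", "d3", "d4"], "q2": ["d5"]}
    clicks = {"q1": {"d2"}, "q2": {"d5"}}
    for triple in sample_training_triples(candidates, clicks):
        print(triple)
```

Drawing negatives from the retrieved candidates rather than from the whole collection keeps them topically close to the query, which is a frequent design choice when training re-rankers. Whether the authors do exactly this is an assumption here.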

Code Implementations: 2 repos