IR · Mar 3, 2021

University of Copenhagen Participation in TREC Health Misinformation Track 2020

Lucas Chaves Lima, Dustin Brandon Wright, Isabelle Augenstein, Maria Maistro

arXiv:2103.02462v15.12 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of filtering health misinformation in information retrieval systems. It is incremental: it builds on existing methods such as BM25 and Transformer models without introducing major innovations.

The paper tackles the retrieval of credible, non-misleading health information with a three-step approach: BM25 with RM3 for initial retrieval; credibility and misinformation scores estimated with a classifier and stance detection; and a merge of these scores for re-ranking. This resulted in 11 runs for the Total Recall Task and 13 runs for the Ad Hoc task in the TREC Health Misinformation Track 2020.

In this paper, we describe our participation in the TREC Health Misinformation Track 2020. We submitted 11 runs to the Total Recall Task and 13 runs to the Ad Hoc task. Our approach consists of 3 steps: (1) we create an initial run with BM25 and RM3; (2) we estimate credibility and misinformation scores for the documents in the initial run; (3) we merge the relevance, credibility and misinformation scores to re-rank documents in the initial run. To estimate credibility scores, we implement a classifier which exploits features based on the content and the popularity of a document. To compute the misinformation score, we apply a stance detection approach with a pretrained Transformer language model. Finally, we use different approaches to merge scores: weighted average, the distance among score vectors and rank fusion.
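The score-merging step (3) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the weights, the distance definition, or the rank fusion method, so the function names, the equal weights, the Euclidean distance to an ideal vector (1, 1, 1), and the use of reciprocal rank fusion with k = 60 are all assumptions made for illustration. Scores are assumed to be normalized to [0, 1], with the misinformation score oriented so that higher means less misleading.

```python
from math import sqrt

# Hypothetical per-document scores: (relevance, credibility, correctness),
# each assumed to lie in [0, 1] with higher meaning better.
scores = {
    "d1": (0.9, 0.2, 0.3),
    "d2": (0.7, 0.8, 0.9),
    "d3": (0.5, 0.6, 0.4),
}

def weighted_average(scores, weights=(1.0, 1.0, 1.0)):
    """Merge score vectors with a weighted mean (weights are an assumption)."""
    total = sum(weights)
    return {
        doc: sum(w * s for w, s in zip(weights, vec)) / total
        for doc, vec in scores.items()
    }

def distance_to_ideal(scores):
    """Negated Euclidean distance to the ideal vector (1, 1, 1).

    One possible reading of "distance among score vectors"; negated so
    that a higher merged value still means a better document.
    """
    return {
        doc: -sqrt(sum((1.0 - s) ** 2 for s in vec))
        for doc, vec in scores.items()
    }

def reciprocal_rank_fusion(rankings, k=60):
    """Reciprocal rank fusion, used here as one example of rank fusion."""
    fused = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            fused[doc] = fused.get(doc, 0.0) + 1.0 / (k + rank)
    return fused

def rerank(merged):
    """Sort documents by merged score, best first (ties broken by doc id)."""
    return sorted(merged, key=lambda doc: (-merged[doc], doc))

def rankings_per_score(scores):
    """Build one ranking per score dimension, as input to rank fusion."""
    n_dims = len(next(iter(scores.values())))
    return [
        sorted(scores, key=lambda doc: (-scores[doc][i], doc))
        for i in range(n_dims)
    ]

by_average = rerank(weighted_average(scores))
by_distance = rerank(distance_to_ideal(scores))
by_fusion = rerank(reciprocal_rank_fusion(rankings_per_score(scores)))
```

In this toy data, `d1` is the most relevant document but scores poorly on the other two dimensions, so all three merging strategies move `d2` to the top of the re-ranked list.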

