IR Apr 26, 2019
Recommending research articles to consumers of online vaccination information
Eliza Harrison, Paige Martin, Didi Surian et al.
Online health communications often provide biased interpretations of evidence and have unreliable links to the source research. We tested the feasibility of a tool for matching webpages to their source evidence. From 207,538 eligible vaccination-related PubMed articles, we evaluated several approaches using 3,573 unique links to webpages from Altmetric. We evaluated methods for ranking the source articles for vaccine-related research described on webpages, comparing simple baseline feature representation and dimensionality reduction approaches to those augmented with canonical correlation analysis (CCA). Performance measures included the median rank of the correct source article; the percentage of webpages for which the source article was correctly ranked first (recall@1); and the percentage ranked within the top 50 candidate articles (recall@50). While augmenting baseline methods using CCA generally improved results, no CCA-based approach outperformed a baseline method, which ranked the correct source article first for over one quarter of webpages and in the top 50 for more than half. Tools to help people identify evidence-based sources for the content they access on vaccination-related webpages are potentially feasible and may support the prevention of bias and misrepresentation of research in news and social media.
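The three performance measures named in this abstract can be illustrated with a minimal sketch. It assumes each webpage yields a list of candidate scores plus the index of its true source article. The function names, the tie-breaking rule, and the toy ranks are hypothetical illustrations and are not taken from the paper.

```python
from statistics import median


def rank_of_correct(scores, correct_index):
    """Return the 1-based rank of the correct candidate under descending score.

    Assumption: ties are resolved in favour of the correct candidate.
    """
    target = scores[correct_index]
    # Count how many candidates score strictly higher than the correct one.
    return 1 + sum(1 for s in scores if s > target)


def median_rank(ranks):
    """Median rank of the correct source article across all webpages."""
    return median(ranks)


def recall_at_k(ranks, k):
    """Percentage of webpages whose correct article is ranked within the top k."""
    return 100.0 * sum(1 for r in ranks if r <= k) / len(ranks)


# Toy ranks for eight hypothetical webpages (not data from the study).
ranks = [1, 3, 120, 1, 47, 800, 12, 51]
print(median_rank(ranks), recall_at_k(ranks, 1), recall_at_k(ranks, 50))
```

Median rank is robust to a few very poorly ranked webpages, while recall@1 and recall@50 report how often the correct article would appear first or within a short list a reader could plausibly scan.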
IR Sep 20, 2017
A shared latent space matrix factorisation method for recommending new trial evidence for systematic review updates
Didi Surian, Adam G. Dunn, Liat Orenstein et al.
Clinical trial registries can be used to monitor the production of trial evidence and signal when systematic reviews become out of date. However, this use has been limited to date due to the extensive manual review required to search for and screen relevant trial registrations. Our aim was to evaluate a new method that could partially automate the identification of trial registrations that may be relevant for systematic review updates. We identified 179 systematic reviews of drug interventions for type 2 diabetes, which included 537 clinical trials that had registrations in ClinicalTrials.gov. We tested a matrix factorisation approach that uses a shared latent space to learn how to rank relevant trial registrations for each systematic review, comparing the performance to document similarity to rank relevant trial registrations. The two approaches were tested on a holdout set of the newest trials from the set of type 2 diabetes systematic reviews and an unseen set of 141 clinical trial registrations from 17 updated systematic reviews published in the Cochrane Database of Systematic Reviews. The matrix factorisation approach outperformed the document similarity approach with a median rank of 59 and recall@100 of 60.9%, compared to a median rank of 138 and recall@100 of 42.8% in the document similarity baseline. In the second set of systematic reviews and their updates, the highest performing approach used document similarity and gave a median rank of 67 (recall@100 of 62.9%). The proposed method was useful for ranking trial registrations to reduce the manual workload associated with finding relevant trials for systematic review updates. The results suggest that the approach could be used as part of a semi-automated pipeline for monitoring potentially new evidence for inclusion in a review update.
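A document similarity baseline of the kind this abstract compares against might look like the following sketch. The abstract does not say how the documents were represented, so TF-IDF weighting with cosine similarity is an assumption here, and the tokeniser, function names, and example texts are hypothetical. It does not implement the shared latent space matrix factorisation method itself.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenisation (a deliberately simple assumption)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (term -> weight) for a list of documents."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF so terms present in every document keep a positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(review_text, registrations):
    """Return registration indices ordered from most to least similar."""
    vectors = tfidf_vectors([review_text] + registrations)
    query, candidates = vectors[0], vectors[1:]
    sims = [cosine(query, c) for c in candidates]
    return sorted(range(len(registrations)), key=lambda i: sims[i], reverse=True)


# Hypothetical texts for illustration only.
review = "metformin for type 2 diabetes in adults"
registrations = [
    "statin therapy for cardiovascular disease",
    "metformin versus placebo in type 2 diabetes",
    "vaccine safety in children",
]
print(rank_candidates(review, registrations))
```

The position of each known relevant registration in this ordering gives the rank that feeds into summary measures such as median rank and recall@100.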