CL · Oct 7, 2020

Understanding Clinical Trial Reports: Extracting Medical Entities and Their Relations

Benjamin E. Nye, Jay DeYoung, Eric Lehman, Ani Nenkova, Iain J. Marshall, Byron C. Wallace

arXiv:2010.03550v3 · 2.724 citations

Originality: Incremental advance

AI Analysis

The paper addresses the time-consuming and expensive manual extraction of trial results by medical experts, though the advance is incremental because it builds on existing NLP methods.

The paper tackles the problem of extracting treatments, outcomes, and their relations from unstructured clinical trial reports to automate evidence synthesis, proposing a new method that outperforms data-driven baselines and demonstrating utility in a real-world drug repurposing evaluation.

The best evidence concerning comparative treatment effectiveness comes from clinical trials, the results of which are reported in unstructured articles. Medical experts must manually extract information from articles to inform decision-making, which is time-consuming and expensive. Here we consider the end-to-end task of both (a) extracting treatments and outcomes from full-text articles describing clinical trials (entity identification) and (b) inferring the reported results for the former with respect to the latter (relation extraction). We introduce new data for this task and evaluate models that have recently achieved state-of-the-art results on similar tasks in Natural Language Processing. We then propose a new method, motivated by how trial results are typically presented, that outperforms these purely data-driven baselines. Finally, we run a fielded evaluation of the model with a non-profit seeking to identify existing drugs that might be re-purposed for cancer, showing the potential utility of end-to-end evidence extraction systems.
