CL · Mar 29, 2023

End-to-End $n$-ary Relation Extraction for Combination Drug Therapies

arXiv:2303.16886v1 · 8 citations · h-index: 29
Originality: Incremental advance
AI Analysis

This work addresses the extraction of combination therapies from biomedical literature, a task that matters to researchers and clinicians treating diseases such as cancer, HIV, malaria, or tuberculosis. It is an incremental advance: it builds on an existing dataset and improves on prior methods.

The paper tackles the extraction of combination drug therapies from scientific literature, an $n$-ary relation extraction task in which $n$ varies from instance to instance. It achieves an F1-score of 66.7% on the CombDrugExt test set, roughly 5 percentage points (absolute) above the best prior non-end-to-end method.

Combination drug therapies are treatment regimens that involve two or more drugs, most commonly administered to patients with cancer, HIV, malaria, or tuberculosis. Currently there are over 350K articles in PubMed that use the "combination drug therapy" MeSH heading, with at least 10K articles published per year over the past two decades. Extracting combination therapies from scientific literature inherently constitutes an $n$-ary relation extraction problem. Unlike the general $n$-ary setting, where $n$ is fixed (e.g., drug-gene-mutation relations, where $n=3$), extracting combination therapies is a special setting where $n \geq 2$ is dynamic and depends on each instance. Recently, Tiktinsky et al. (NAACL 2022) introduced a first-of-its-kind dataset, CombDrugExt, for extracting such therapies from literature. Here, we use a sequence-to-sequence style end-to-end extraction method to achieve an F1-score of $66.7\%$ on the CombDrugExt test set for positive (or effective) combinations. This is an absolute $\approx 5\%$ F1-score improvement even over the prior best relation classification score, which was obtained with drug entities already spotted (hence, not end-to-end). Thus our effort introduces the first model for end-to-end extraction on this task, and it is already superior to the best prior non-end-to-end model. Our model extracts all drug entities and relations in a single pass and is highly suitable for dynamic $n$-ary extraction scenarios.
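To make the dynamic $n$-ary setting concrete, here is a minimal sketch of how variable-size drug combinations might be written out as a single target string for a sequence-to-sequence model, parsed back from generated text, and scored. The separators, the `POS` label name, and the helper names (`linearize`, `parse`, `f1_score`) are assumptions made for illustration; they are not taken from the paper or its repository, and the exact-match scoring shown here may differ from the official CombDrugExt evaluation.

```python
from typing import FrozenSet, List, Set, Tuple

# A relation is a label plus an unordered set of n >= 2 drug names.
Relation = Tuple[str, FrozenSet[str]]

# Hypothetical separators for the linearized target string.
REL_SEP = ";"     # between relations
DRUG_SEP = "|"    # between drugs inside one relation
LABEL_SEP = "=>"  # between the drug list and the relation label


def normalize(name: str) -> str:
    """Lowercase and trim a drug mention so comparisons are stable."""
    return name.strip().lower()


def linearize(relations: List[Relation]) -> str:
    """Turn a list of variable-arity relations into one target string."""
    parts = []
    for label, drugs in relations:
        names = sorted(normalize(d) for d in drugs)
        parts.append(f" {DRUG_SEP} ".join(names) + f" {LABEL_SEP} {label}")
    return f" {REL_SEP} ".join(parts)


def parse(output: str) -> Set[Relation]:
    """Recover relations from generated text, skipping malformed chunks."""
    relations: Set[Relation] = set()
    for chunk in output.split(REL_SEP):
        if LABEL_SEP not in chunk:
            continue  # no label marker, so not a well-formed relation
        drug_part, label = chunk.rsplit(LABEL_SEP, 1)
        drugs = frozenset(
            normalize(d) for d in drug_part.split(DRUG_SEP) if d.strip()
        )
        if len(drugs) >= 2:  # a combination needs at least two drugs
            relations.add((label.strip(), drugs))
    return relations


def f1_score(pred: Set[Relation], gold: Set[Relation]) -> float:
    """Exact-match F1 over relation sets (an assumed, simplified metric)."""
    true_pos = len(pred & gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred)
    recall = true_pos / len(gold)
    return 2 * precision * recall / (precision + recall)
```

Because each relation carries its own drug list, the same format covers two-drug and five-drug regimens without fixing $n$ in advance, which is the property the abstract highlights for the end-to-end approach.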

Code Implementations: 1 repo