CL, Jun 20, 2012

BADREX: In situ expansion and coreference of biomedical abbreviations using dynamic regular expressions

arXiv:1206.4522v1, 15 citations, Has Code
Originality: Incremental advance
AI Analysis

BADREX addresses the need for accurate abbreviation expansion in biomedical text processing, though it is an incremental advance over existing methods.

The paper tackles the problem of identifying biomedical abbreviations in text and linking them to their definitions, reporting precision and recall of 98% and 97% on the Medstract corpus and 90% and 85% on a much larger corpus.

BADREX uses dynamically generated regular expressions to annotate term definition-term abbreviation pairs, and corefers unpaired acronyms and abbreviations back to their initial definition in the text. Against the Medstract corpus BADREX achieves precision and recall of 98% and 97%, and against a much larger corpus, 90% and 85%, respectively. BADREX yields improved performance over previous approaches, requires no training data and allows runtime customisation of its input parameters. BADREX is freely available from https://github.com/philgooch/BADREX-Biomedical-Abbreviation-Expander as a plugin for the General Architecture for Text Engineering (GATE) framework and is licensed under the GPLv3.
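The general idea of a dynamically generated regular expression can be illustrated with a minimal Python sketch. This is not the BADREX implementation (which is a GATE plugin) and makes simplifying assumptions: abbreviations appear in parentheses directly after their definition, and each abbreviation letter is the initial of one word in the long form. The names `build_definition_regex`, `find_pairs`, and `corefer` are hypothetical, chosen only for this illustration.

```python
import re

# Assumption: a candidate abbreviation is 2-10 alphanumeric characters in parentheses.
CANDIDATE = re.compile(r"\(\s*([A-Za-z][A-Za-z0-9]{1,9})\s*\)")


def build_definition_regex(abbrev):
    """Build, at run time, a regex for a long form whose word initials spell
    `abbrev`, immediately followed by the abbreviation in parentheses."""
    # One word per abbreviation letter, separated by whitespace or hyphens.
    words = r"[\s-]+".join(rf"{re.escape(ch)}\w*" for ch in abbrev)
    return re.compile(
        rf"\b(?P<long>{words})\s*\(\s*(?P<short>{re.escape(abbrev)})\s*\)",
        re.IGNORECASE,
    )


def find_pairs(text):
    """Return {abbreviation: (long form, end offset of the defining match)}."""
    pairs = {}
    for candidate in CANDIDATE.finditer(text):
        short = candidate.group(1)
        if short in pairs:
            continue  # keep only the first definition found
        hit = build_definition_regex(short).search(text)
        if hit:
            pairs[short] = (hit.group("long"), hit.end())
    return pairs


def corefer(text, pairs):
    """Link later, unpaired uses of each abbreviation back to its definition."""
    links = []
    for short, (long_form, defined_at) in pairs.items():
        for m in re.finditer(rf"\b{re.escape(short)}\b", text):
            if m.start() >= defined_at:  # skip the defining occurrence itself
                links.append((m.start(), short, long_form))
    return sorted(links)


if __name__ == "__main__":
    sample = (
        "Patients with chronic obstructive pulmonary disease (COPD) were "
        "enrolled. COPD severity was graded."
    )
    found = find_pairs(sample)
    print(found)
    print(corefer(sample, found))
```

A real system has to handle many cases this sketch ignores, such as abbreviations that skip stop words, take several letters from one word, or precede their definition.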

