CL | Jul 29, 2020

Biomedical and Clinical English Model Packages in the Stanza Python NLP Library

arXiv:2007.14640v1 | 14 citations
Originality: Synthesis-oriented
AI Analysis

This work provides accurate and efficient NLP tools for biomedical and clinical text processing, addressing a domain-specific need.

The authors introduced biomedical and clinical English model packages for the Stanza Python NLP library, achieving syntactic analysis and named entity recognition performance on par with or surpassing state-of-the-art results.

We introduce biomedical and clinical English model packages for the Stanza Python NLP library. These packages offer accurate syntactic analysis and named entity recognition capabilities for biomedical and clinical text, by combining Stanza's fully neural architecture with a wide variety of open datasets as well as large-scale unsupervised biomedical and clinical text data. We show via extensive experiments that our packages achieve syntactic analysis and named entity recognition performance that is on par with or surpasses state-of-the-art results. We further show that these models do not compromise speed compared to existing toolkits when GPU acceleration is available, and are made easy to download and use with Stanza's Python interface. A demonstration of our packages is available at: http://stanza.run/bio.
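The download-and-use workflow mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' own demo code: it assumes the `stanza` package is installed, that the clinical `mimic` package and the `i2b2` NER model are the ones wanted, and that network access is available for the first download. The helper names `pipeline_config` and `extract_entities` are hypothetical and exist only for this example.

```python
from typing import Any, Dict, List, Tuple


def pipeline_config(package: str = "mimic", ner_model: str = "i2b2") -> Dict[str, Any]:
    """Build keyword arguments shared by stanza.download and stanza.Pipeline.

    The defaults ("mimic" package, "i2b2" NER model) are an assumed choice
    for clinical text; biomedical packages such as "craft" or "genia" can be
    passed instead.
    """
    return {"lang": "en", "package": package, "processors": {"ner": ner_model}}


def extract_entities(text: str, package: str = "mimic",
                     ner_model: str = "i2b2") -> List[Tuple[str, str]]:
    """Download the chosen models if needed, then return (text, type) entity pairs."""
    # Imported lazily so the config helper above works without stanza installed.
    import stanza

    config = pipeline_config(package, ner_model)
    stanza.download(**config)        # fetches models on first use
    nlp = stanza.Pipeline(**config)  # tokenization, syntax, and NER for the package
    doc = nlp(text)
    return [(ent.text, ent.type) for ent in doc.entities]
```

Calling `extract_entities("The patient was given aspirin for chest pain.")` would return whatever entity spans the selected NER model finds; the exact labels depend on the model, so none are shown here. Keeping the configuration in one helper means the download and pipeline calls cannot drift apart in which package they name.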

