CL Apr 30, 2018

Syntactic Patterns Improve Information Extraction for Medical Search

arXiv:1805.00097v11091 citations
Originality: Incremental advance
AI Analysis

This work addresses information extraction for medical professionals searching the literature. It is an incremental advance: it builds on existing sequence tagging models by adding pattern-based features.

The paper tackles the problem of extracting medically relevant categories (patients, interventions, outcomes) from the literature. It incorporates features encoding syntactic patterns into state-of-the-art sequence tagging models, both linear and neural, which improves their extraction performance on these categories.

Medical professionals search the published literature by specifying the type of patients, the medical intervention(s) and the outcome measure(s) of interest. In this paper we demonstrate how features encoding syntactic patterns improve the performance of state-of-the-art sequence tagging models (both linear and neural) for information extraction of these medically relevant categories. We present an analysis of the type of patterns exploited, and the semantic space induced for these, i.e., the distributed representations learned for identified multi-token patterns. We show that these learned representations differ substantially from those of the constituent unigrams, suggesting that the patterns capture contextual information that is otherwise lost.
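To make the idea of "features encoding syntactic patterns" concrete, the following is a minimal sketch of how pattern matches could be turned into per-token features for a linear sequence tagger. It is not the authors' implementation: the pattern inventory (`PATTERNS`), the feature names, and the helper functions `pattern_features` and `token_features` are all hypothetical, and patterns are assumed here to be simple part-of-speech tag sequences.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical POS-tag patterns, chosen only for illustration.
# The paper's actual pattern inventory is not reproduced here.
PATTERNS: Dict[str, Tuple[str, ...]] = {
    "NUM_NOUN": ("CD", "NNS"),            # e.g. "120 patients"
    "ADJ_NOUN_NOUN": ("JJ", "NN", "NN"),  # e.g. "oral iron therapy"
}


def pattern_features(pos_tags: Sequence[str]) -> List[Dict[str, bool]]:
    """Mark every token covered by a pattern match with a B/I-style feature."""
    feats: List[Dict[str, bool]] = [dict() for _ in pos_tags]
    for name, pattern in PATTERNS.items():
        width = len(pattern)
        # Slide a window of the pattern's width over the POS sequence.
        for start in range(len(pos_tags) - width + 1):
            if tuple(pos_tags[start:start + width]) == pattern:
                for offset in range(width):
                    position = "B" if offset == 0 else "I"
                    feats[start + offset][f"pat={name}:{position}"] = True
    return feats


def token_features(tokens: Sequence[str],
                   pos_tags: Sequence[str]) -> List[Dict[str, object]]:
    """Combine basic lexical features with the pattern features per token."""
    combined: List[Dict[str, object]] = []
    for token, tag, pat in zip(tokens, pos_tags, pattern_features(pos_tags)):
        features: Dict[str, object] = {"word.lower": token.lower(), "pos": tag}
        features.update(pat)
        combined.append(features)
    return combined


tokens = ["120", "patients", "received", "oral", "iron", "therapy"]
pos_tags = ["CD", "NNS", "VBD", "JJ", "NN", "NN"]
features = token_features(tokens, pos_tags)
```

Feature dictionaries of this shape are the usual input to linear taggers such as CRFs. For a neural tagger, each pattern identifier would more plausibly be mapped to a learned embedding, which is one way to read the abstract's "distributed representations learned for identified multi-token patterns".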

