CL, IR, LG, ML | Mar 31, 2020

Enriching Consumer Health Vocabulary Using Enhanced GloVe Word Embedding

arXiv:2004.00150v2 | 10 citations
AI Analysis

This work addresses the need for more accessible medical terminology for laypeople, but it is incremental as it builds on existing word embedding methods.

The paper tackles the problem of enriching consumer health vocabulary by developing an enhanced word embedding technique that generates new laymen's terms from healthcare social media text. The technique detected new CHV terms and outperformed unmodified GloVe.

Open-Access and Collaborative Consumer Health Vocabulary (OAC CHV, or CHV for short) is a collection of medical terms written in plain English. It provides a list of simple, easy, and clear terms that laymen prefer to use rather than an equivalent professional medical term. The National Library of Medicine (NLM) has integrated and mapped the CHV terms to their Unified Medical Language System (UMLS). These CHV terms are mapped to 56,000 professional concepts in the UMLS. We found that about 48% of these laymen's terms are still jargon and match the professional terms in the UMLS. In this paper, we present an enhanced word embedding technique that generates new CHV terms from consumer-generated text. We downloaded our corpus from a healthcare social media platform and evaluated our new method, which is based on iterative feedback to the word embedding, using ground truth built from the existing CHV terms. Our feedback algorithm outperformed unmodified GloVe, and new CHV terms have been detected.
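
The abstract does not spell out the feedback algorithm, so the following is only a minimal sketch of the two generic steps any such pipeline would likely involve: retrieving the nearest neighbours of a seed term in an embedding space, and scoring the candidates against a ground-truth set of known laymen's terms. The function names, the example terms, and the three-dimensional vectors are all hypothetical; real GloVe vectors would be trained on the social media corpus and have far more dimensions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_terms(seed, vectors, k=2):
    """Return the k terms whose vectors are most similar to the seed term."""
    seed_vec = vectors[seed]
    scored = [
        (cosine(seed_vec, vec), term)
        for term, vec in vectors.items()
        if term != seed
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [term for _, term in scored[:k]]


def precision(candidates, ground_truth):
    """Fraction of candidate terms that appear in the ground-truth set."""
    if not candidates:
        return 0.0
    hits = sum(1 for term in candidates if term in ground_truth)
    return hits / len(candidates)


# Hypothetical toy vectors, invented for illustration only.
toy_vectors = {
    "hypertension": [0.9, 0.1, 0.0],
    "high blood pressure": [0.8, 0.2, 0.1],
    "bp": [0.7, 0.3, 0.0],
    "headache": [0.1, 0.9, 0.2],
    "insulin": [0.0, 0.2, 0.9],
}

# Hypothetical ground truth: laymen's terms already known for the seed.
known_laymen_terms = {"high blood pressure"}

candidates = nearest_terms("hypertension", toy_vectors, k=2)
print(candidates)
print(precision(candidates, known_laymen_terms))
```

In a feedback setting, a score like this could be computed after each training round and used to decide how to adjust the embedding, but how the paper actually feeds the signal back is not described in the abstract and is not modelled here.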
