Knowledge Transfer with Medical Language Embeddings
The paper addresses the challenge of knowledge synthesis in the medical domain, where data sparsity and scarcity make language modeling difficult. It appears incremental, since it builds on existing methods such as distributional semantics and knowledge graph completion.
The paper tackles the problem of predicting new relationships between medical concepts by unifying structured knowledge graphs with unstructured text in a probabilistic generative model. The resulting model can predict relationships for tokens that do not appear in the relational database.
Identifying relationships between concepts is a key aspect of scientific knowledge synthesis. Finding these links often requires a researcher to laboriously search through scientific papers and databases, as the size of these resources grows ever larger. In this paper we describe how distributional semantics can be used to unify structured knowledge graphs with unstructured text to predict new relationships between medical concepts, using a probabilistic generative model. Our approach is also designed to ameliorate data sparsity and scarcity issues in the medical domain, which make language modelling more challenging. Specifically, we integrate the medical relational database (SemMedDB) with text from electronic health records (EHRs) to perform knowledge graph completion. We further demonstrate the ability of our model to predict relationships between tokens not appearing in the relational database.
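
To make the idea of knowledge graph completion over embeddings concrete, here is a minimal sketch. It is not the paper's model: it assumes a simple translation-style energy (subject vector plus relation offset should land near the object vector) and turns energies into a distribution over candidate objects with a softmax. The embeddings, the `TREATS` relation vector, and the function names `energy` and `object_distribution` are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical 2-d embeddings; values are made up for illustration only.
EMBEDDINGS = {
    "aspirin": [1.0, 0.0],
    "headache": [1.0, 1.0],
    "fracture": [-1.0, 0.5],
}
# One offset vector per relation (a translation-style assumption,
# not taken from the paper).
RELATIONS = {"TREATS": [0.0, 1.0]}


def energy(subject, relation, obj):
    """Squared distance between (subject + relation) and obj; lower is more plausible."""
    s, r, o = EMBEDDINGS[subject], RELATIONS[relation], EMBEDDINGS[obj]
    return sum((si + ri - oi) ** 2 for si, ri, oi in zip(s, r, o))


def object_distribution(subject, relation):
    """Softmax over negative energies: P(object | subject, relation)."""
    candidates = [t for t in EMBEDDINGS if t != subject]
    weights = {t: math.exp(-energy(subject, relation, t)) for t in candidates}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}


probs = object_distribution("aspirin", "TREATS")
best = max(probs, key=probs.get)
print(best, probs)
```

Under this assumption, any token that has an embedding can be scored as a subject or object, including one whose vector was learned only from text such as EHR notes. That is one plausible way a model could rank relationships for tokens absent from the relational database.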