CL | Jan 15, 2024

Towards Efficient Methods in Medical Question Answering using Knowledge Graph Embeddings

Saptarshi Sengupta, Connor Heaton, Suhan Cui, Soumalya Sarkar, Prasenjit Mitra

arXiv:2401.07977v31.04 citations | h-index: 5 | Has Code | BIBM

Originality: Incremental advance

AI Analysis

The method offers a more accessible route to domain proficiency in medical NLP without costly pre-training, though the advance is incremental because it builds on existing embedding-alignment techniques.

The paper tackles the problem of expensive in-domain pre-training for medical question answering by proposing a resource-efficient method that aligns knowledge graph embeddings with pre-trained language models, achieving performance on par with or exceeding domain-specific models on COVID-QA and PubMedQA datasets.

In Natural Language Processing (NLP), Machine Reading Comprehension (MRC) is the task of answering a question based on a given context. To handle questions in the medical domain, modern language models such as BioBERT, SciBERT and even ChatGPT are trained on vast amounts of in-domain medical corpora. However, in-domain pre-training is expensive in terms of time and resources. In this paper, we propose a resource-efficient approach for injecting domain knowledge into a model without relying on such domain-specific pre-training. Knowledge graphs are powerful resources for accessing medical information. Building on existing work, we introduce a method that uses Multi-Layer Perceptrons (MLPs) to align embeddings extracted from medical knowledge graphs with the embedding spaces of pre-trained language models (LMs) and to integrate them into those models. The aligned embeddings are fused with the open-domain LMs BERT and RoBERTa, which are fine-tuned for two MRC tasks: span detection (COVID-QA) and multiple-choice questions (PubMedQA). We compare our method to prior techniques that rely on a vocabulary overlap for embedding alignment and show how our method circumvents this requirement to deliver better performance. On both datasets, our method allows BERT/RoBERTa either to perform on par with (and occasionally exceed) stronger domain-specific models or to improve in general over prior techniques. With the proposed approach, we point to an alternative to in-domain pre-training for achieving domain proficiency. Our code is publicly available.
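
The alignment idea in the abstract can be illustrated with a toy sketch: a small MLP learns to map vectors from a knowledge-graph embedding space into a language-model embedding space of a different dimension. Everything here is an assumption for illustration and not the authors' implementation: the dimensions, the synthetic data, the one-hidden-layer architecture, the squared-error objective, and in particular the availability of paired (KG vector, LM vector) training examples, since the abstract does not say how training targets are obtained. The function names (`make_mlp`, `forward`, `sgd_step`, `train`) are hypothetical.

```python
import math
import random


def make_mlp(d_in, d_hidden, d_out, rng):
    """Create a one-hidden-layer MLP with small random weights."""
    def mat(rows, cols):
        scale = 1.0 / math.sqrt(cols)
        return [[rng.uniform(-scale, scale) for _ in range(cols)]
                for _ in range(rows)]
    return {"W1": mat(d_hidden, d_in), "b1": [0.0] * d_hidden,
            "W2": mat(d_out, d_hidden), "b2": [0.0] * d_out}


def forward(mlp, x):
    """Return (hidden activations, output) for one input vector."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(mlp["W1"], mlp["b1"])]
    y = [sum(w * hi for w, hi in zip(row, h)) + b
         for row, b in zip(mlp["W2"], mlp["b2"])]
    return h, y


def sgd_step(mlp, x, target, lr):
    """One gradient step on L = 0.5 * ||y - target||^2; returns the loss."""
    h, y = forward(mlp, x)
    err = [yi - ti for yi, ti in zip(y, target)]  # dL/dy
    # Backpropagate through the output layer and tanh before updating W2.
    dh = [(1.0 - hj * hj) * sum(err[k] * mlp["W2"][k][j]
                                for k in range(len(err)))
          for j, hj in enumerate(h)]
    for k, ek in enumerate(err):
        for j, hj in enumerate(h):
            mlp["W2"][k][j] -= lr * ek * hj
        mlp["b2"][k] -= lr * ek
    for j, dj in enumerate(dh):
        for i, xi in enumerate(x):
            mlp["W1"][j][i] -= lr * dj * xi
        mlp["b1"][j] -= lr * dj
    return 0.5 * sum(e * e for e in err)


def train(mlp, pairs, epochs=200, lr=0.05):
    """Train on (kg_vector, lm_vector) pairs; return mean loss per epoch."""
    history = []
    for _ in range(epochs):
        total = sum(sgd_step(mlp, x, t, lr) for x, t in pairs)
        history.append(total / len(pairs))
    return history


# Synthetic stand-ins for the two embedding spaces (assumed sizes).
rng = random.Random(0)
KG_DIM, LM_DIM = 4, 6
true_map = [[rng.uniform(-1, 1) for _ in range(KG_DIM)] for _ in range(LM_DIM)]
kg_vecs = [[rng.uniform(-1, 1) for _ in range(KG_DIM)] for _ in range(20)]
pairs = [(x, [sum(w * xi for w, xi in zip(row, x)) for row in true_map])
         for x in kg_vecs]

mlp = make_mlp(KG_DIM, 8, LM_DIM, rng)
history = train(mlp, pairs)
aligned = forward(mlp, kg_vecs[0])[1]  # KG vector projected into LM space
```

In a real setting the projected vector would have the LM's hidden size, so it could be combined with the model's token embeddings. How that fusion is done in the paper is not described in the abstract, and this sketch does not attempt it.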
