CL AI Jun 26, 2022

Contextual embedding and model weighting by fusing domain knowledge on Biomedical Question Answering

Yuxuan Lu, Jingya Yan, Zhixuan Qi, Zhongzheng Ge, Yongping Du

arXiv:2206.12866v11.15 citations h-index: 14 Has Code

Originality: Incremental advance

AI Analysis

This work addresses biomedical question answering for researchers and practitioners in healthcare and the life sciences. It is an incremental improvement achieved through model fusion and weighting.

Biomedical question answering models struggle to learn domain knowledge from limited training data. To address this, the authors propose a contextual embedding method that fuses an open-domain QA model with a biomedical pre-trained model, and they add an MLP-based weighting layer to combine the two. Their approach achieved state-of-the-art performance on the BioMRC dataset, outperforming existing systems by a large margin.

Biomedical Question Answering aims to obtain an answer to a given question from the biomedical domain. Because the task demands extensive biomedical domain knowledge, it is difficult for a model to learn that knowledge from limited training data. We propose a contextual embedding method that combines the open-domain QA model AoA with the BioBERT model, which is pre-trained on biomedical domain data. We adopt unsupervised pre-training on a large biomedical corpus and supervised fine-tuning on a biomedical question answering dataset. Additionally, we adopt an MLP-based model weighting layer that automatically exploits the advantages of the two models to provide the correct answer. The public dataset BioMRC, constructed from the PubMed corpus, is used to evaluate our method. Experimental results show that our model outperforms the state-of-the-art system by a large margin.
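The abstract does not spell out how the MLP-based weighting layer is built, so the following is only a minimal sketch of one plausible reading: a small MLP looks at the candidate-answer distributions of both models and emits a gate between 0 and 1 that mixes them. Every name here (`softmax`, `mlp_gate`, `fuse`, `init_params`), the single hidden layer, the sigmoid gate, and the random untrained weights are assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def softmax(scores):
    """Turn raw candidate scores into a probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def init_params(n_in, n_hidden, seed=0):
    """Random, untrained weights for a one-hidden-layer MLP (illustrative only)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    return w1, b1, w2, b2


def mlp_gate(features, w1, b1, w2, b2):
    """ReLU hidden layer followed by a sigmoid, giving a scalar in (0, 1)."""
    hidden = [
        max(0.0, sum(w * f for w, f in zip(row, features)) + b)
        for row, b in zip(w1, b1)
    ]
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))


def fuse(scores_a, scores_b, params):
    """Mix two models' candidate distributions with an MLP-predicted weight."""
    p_a = softmax(scores_a)
    p_b = softmax(scores_b)
    # Assumption: the gate sees both distributions concatenated as features.
    gate = mlp_gate(p_a + p_b, *params)
    fused = [gate * a + (1.0 - gate) * b for a, b in zip(p_a, p_b)]
    return gate, fused


# Hypothetical scores over three answer candidates from each model.
scores_open_domain = [2.0, 0.5, -1.0]
scores_biomedical = [0.1, 1.8, -0.5]
params = init_params(n_in=6, n_hidden=4)
gate, fused = fuse(scores_open_domain, scores_biomedical, params)
best_candidate = max(range(len(fused)), key=fused.__getitem__)
```

Because the fused scores are a convex combination of two probability distributions, they still sum to one. In a trained system the gate's weights would be learned jointly with the answer-selection loss rather than drawn at random.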
