Transferability of Natural Language Inference to Biomedical Question Answering
This work addresses data scarcity and domain expertise issues in biomedical QA, presenting an incremental improvement through sequential transfer learning.
The paper tackles biomedical question answering by transferring natural language inference knowledge to BioBERT, improving performance by +5.59% on Yes/No, +0.53% on Factoid, and +13.58% on List type questions relative to results from the previous challenge (BioASQ 7B Phase B).
Biomedical question answering (QA) is a challenging task due to the scarcity of data and the requirement of domain expertise. Pre-trained language models have been used to address these issues. Recently, learning relationships between sentence pairs has been shown to improve performance in general QA. In this paper, we focus on applying BioBERT to transfer the knowledge of natural language inference (NLI) to biomedical QA. We observe that BioBERT trained on the NLI dataset obtains better performance on Yes/No (+5.59%), Factoid (+0.53%), and List type (+13.58%) questions compared to the performance obtained in a previous challenge (BioASQ 7B Phase B). We present a sequential transfer learning method that performed well in the 8th BioASQ Challenge (Phase B). In sequential transfer learning, the order in which tasks are fine-tuned is important. We also measure the unanswerable rate of the extractive QA setting when factoid and list type questions are converted to the format of the Stanford Question Answering Dataset (SQuAD).
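The conversion to SQuAD format and the unanswerable rate mentioned above can be illustrated with a minimal sketch. It is not the authors' preprocessing code. The field names (`id`, `body`, `exact_answer`, `snippets`), the helper names `to_squad_examples` and `unanswerable_rate`, the case-insensitive string matching, and the sample records are all assumptions made for illustration. Here a question is counted as unanswerable when none of its exact answers appears verbatim in any of its snippets, which is one plausible reading of the term.

```python
from typing import Dict, List


def to_squad_examples(question: Dict) -> List[Dict]:
    """Turn one BioASQ-style question into SQuAD-style examples, one per snippet.

    The input layout is an assumption: factoid answers are a flat list of
    strings, list type answers are a list of synonym lists.
    """
    # Flatten both answer layouts into a single list of candidate strings.
    candidates: List[str] = []
    for entry in question["exact_answer"]:
        if isinstance(entry, list):
            candidates.extend(entry)
        else:
            candidates.append(entry)

    examples = []
    for i, snippet in enumerate(question["snippets"]):
        context = snippet["text"]
        lowered = context.lower()
        spans = []
        for answer in candidates:
            # Case-insensitive exact match (a simplification of real span alignment).
            start = lowered.find(answer.lower())
            if start != -1:
                spans.append({
                    "text": context[start:start + len(answer)],
                    "answer_start": start,
                })
        examples.append({
            "id": f"{question['id']}_{i:03d}",
            "question": question["body"],
            "context": context,
            "answers": spans,  # an empty list means no extractable span
        })
    return examples


def unanswerable_rate(questions: List[Dict]) -> float:
    """Fraction of questions with no answer span in any of their snippets."""
    if not questions:
        return 0.0
    unanswerable = sum(
        1 for q in questions
        if not any(ex["answers"] for ex in to_squad_examples(q))
    )
    return unanswerable / len(questions)


# Illustrative records only; these are not taken from the BioASQ data.
sample_questions = [
    {
        "id": "q1",
        "body": "Which gene is mutated in cystic fibrosis?",
        "exact_answer": ["CFTR"],
        "snippets": [{"text": "Cystic fibrosis is caused by mutations in the CFTR gene."}],
    },
    {
        "id": "q2",
        "body": "Which drugs were compared in the trial?",
        "exact_answer": [["aspirin"], ["ibuprofen"]],
        "snippets": [{"text": "The trial compared two common analgesics."}],
    },
]

if __name__ == "__main__":
    print(unanswerable_rate(sample_questions))
```

In this sketch the second question is unanswerable because its snippet never names either drug, so an extractive model would have no span to select for it.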