Large Language Models for Multi-Choice Question Classification of Medical Subjects
This work addresses the challenge of automatic question answering in healthcare, but its contribution is incremental: it builds on existing methods and applies them to a specific domain.
The paper tackles the problem of using large language models to classify multi-choice questions into medical subjects, reporting an accuracy of 0.68 on the development set and 0.60 on the test set of the MedMCQA dataset.
The aim of this paper is to evaluate whether large language models trained on multi-choice question data can be used to discriminate between medical subjects. This is an important and challenging task for automatic question answering. To achieve this goal, we train deep neural networks for multi-class classification of questions into the inferred medical subjects. Using our Multi-Question (MQ) Sequence-BERT method, we outperform the state-of-the-art results on the MedMCQA dataset, with accuracies of 0.68 and 0.60 on its development and test sets, respectively. These results demonstrate the capability of AI, and of LLMs in particular, for multi-class classification tasks in the healthcare domain.
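To make the task setup concrete, the following is a minimal sketch of the data preparation and the accuracy metric for subject classification. It is not the paper's MQ Sequence-BERT implementation: the helper names (`build_input`, `build_label_map`, `accuracy`), the `[SEP]`-joined input layout, and the toy subject names are assumptions chosen for illustration only.

```python
from typing import Dict, Sequence


def build_input(question: str, options: Sequence[str], sep: str = " [SEP] ") -> str:
    """Join a question stem and its answer options into one text sequence.

    Assumption: the classifier reads the question and all options as a
    single sequence; the source does not specify the exact input layout.
    """
    parts = [question.strip()] + [option.strip() for option in options]
    return sep.join(parts)


def build_label_map(subjects: Sequence[str]) -> Dict[str, int]:
    """Assign a stable integer class id to each distinct subject name."""
    return {name: index for index, name in enumerate(sorted(set(subjects)))}


def accuracy(predicted: Sequence[int], gold: Sequence[int]) -> float:
    """Fraction of examples whose predicted class id equals the gold class id."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if len(gold) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(1 for p, g in zip(predicted, gold) if p == g)
    return correct / len(gold)


# Hypothetical toy example (not taken from MedMCQA).
text = build_input(
    "Which nerve supplies the deltoid muscle?",
    ["Axillary", "Radial", "Ulnar", "Median"],
)
label_map = build_label_map(["Anatomy", "Pharmacology", "Anatomy"])
```

In this sketch, a trained model would map each `text` to one class id from `label_map`, and `accuracy` would then be computed separately on the development and test sets to obtain figures comparable to those reported above.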