Making Neural Machine Reading Comprehension Faster
This work addresses the need for faster inference in machine reading comprehension. Its contribution is incremental, as it builds on existing BERT and knowledge distillation methods.
The study tackled the problem of slow inference time in neural machine reading comprehension by applying knowledge distillation to train smaller BERT-based models, achieving improved computational speed compared to other models with similar goals.
This study addresses the machine reading comprehension problem, in which questions must be answered given a context passage. The challenge is to develop a computationally faster model with improved inference time. BERT, the state of the art in many natural language understanding tasks, is used, and knowledge distillation is applied to train two smaller models. The resulting models are compared with other models developed for the same purpose.
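As an illustration of the knowledge distillation step mentioned above, the following is a minimal sketch of one common formulation: the soft-target loss with a temperature. The text above does not state which loss was used, so the function names, the temperature, and the weighting factor alpha are all assumptions made for illustration. The logits could be, for example, answer start-position scores over the tokens of the context passage, with the teacher being the large BERT model and the student one of the smaller models.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_total = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_total for z in scaled]


def distillation_loss(student_logits, teacher_logits, true_index,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term.

    Hypothetical helper: temperature and alpha are illustrative defaults,
    not values taken from the study.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)

    # Soft term: KL(teacher || student) on temperature-softened distributions.
    soft = sum(math.exp(lt) * (lt - ls)
               for lt, ls in zip(log_p_teacher, log_p_student))

    # Hard term: cross-entropy of the student against the gold label (T = 1).
    hard = -log_softmax(student_logits)[true_index]

    # The T^2 factor keeps the soft term's gradient scale comparable
    # across different temperatures.
    return alpha * temperature ** 2 * soft + (1.0 - alpha) * hard


# Example: four candidate start positions, gold answer starts at index 2.
loss = distillation_loss([0.5, 1.0, 2.0, 0.1], [0.2, 0.8, 3.0, 0.0], true_index=2)
```

When the student reproduces the teacher's logits exactly, the soft term vanishes and only the hard-label term remains, which is one way to sanity-check an implementation of this kind.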