Modern Question Answering Datasets and Benchmarks: A Survey
This survey provides a comprehensive overview for researchers in NLP and AI, but it is incremental: it synthesizes existing datasets without contributing new methods or results.
This survey investigates influential question answering datasets released in the deep learning era, covering textual and visual QA tasks, and discusses current research challenges.
Question Answering (QA) is one of the most important natural language processing (NLP) tasks. It aims to use NLP technologies to generate an answer to a given question based on a massive unstructured corpus. With the development of deep learning, increasingly challenging QA datasets are being proposed, and many new methods for solving them are emerging. In this paper, we investigate influential QA datasets that have been released in the era of deep learning. Specifically, we begin by introducing two of the most common QA tasks, textual question answering and visual question answering, separately, covering the most representative datasets for each, and then discuss some current challenges of QA research.