CL | Dec 27, 2021

A Survey on non-English Question Answering Dataset

Andreas Chandra, Affandy Fahrizain, Ibrahim, Simon Willyanto Laufried

arXiv:2112.13634v1 | 13 citations

Originality: Synthesis-oriented

AI Analysis

The survey addresses the need of researchers and practitioners for organized resources in non-English question answering, but it is incremental: it compiles existing datasets without introducing new methods or data.

This survey identifies, summarizes, and analyzes existing non-English question answering datasets, covering languages such as French, German, Japanese, Chinese, Arabic, Russian, as well as multilingual and cross-lingual resources.

Research on question answering datasets and models has gained a lot of attention in the research community. Many researchers release their own question answering datasets along with their models, and the area has seen tremendous progress. The aim of this survey is to identify, summarize, and analyze the existing datasets that researchers have released, especially non-English datasets, along with related resources such as research code and evaluation metrics. In this paper, we review question answering datasets that are available in common languages other than English, such as French, German, Japanese, Chinese, Arabic, and Russian, as well as multilingual and cross-lingual question answering datasets.
