CL | AI | Jun 6, 2024

FairytaleQA Translated: Enabling Educational Question and Answer Generation in Less-Resourced Languages

arXiv:2406.04233v2 | 5 citations | Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of limited educational resources for reading comprehension in less-resourced languages, though it is incremental as it adapts an existing dataset and methods.

The paper tackles the lack of question answering datasets in less-resourced languages by introducing machine-translated versions of FairytaleQA, establishing benchmarks for question generation and answering tasks, and evaluating a model for generating question-answer pairs with metrics like question well-formedness and children suitability.

Question Answering (QA) datasets are crucial in assessing reading comprehension skills for both machines and humans. While numerous datasets have been developed in English for this purpose, a noticeable void exists in less-resourced languages. To alleviate this gap, our paper introduces machine-translated versions of FairytaleQA, a renowned QA dataset designed to assess and enhance narrative comprehension skills in young children. By employing fine-tuned, modest-scale models, we establish benchmarks for both Question Generation (QG) and QA tasks within the translated datasets. In addition, we present a case study proposing a model for generating question-answer pairs, with an evaluation incorporating quality metrics such as question well-formedness, answerability, relevance, and children suitability. Our evaluation prioritizes quantifying and describing error cases, along with providing directions for future work. This paper contributes to the advancement of QA and QG research in less-resourced languages, promoting accessibility and inclusivity in the development of these models for reading comprehension. The code and data are publicly available at github.com/bernardoleite/fairytaleqa-translated.
